
Data engineer jobs in New York

- 5,053 jobs
  • AI ML Engineer

    Dv01

    Data engineer job in New York, NY

    dv01 is lifting the curtain on the largest financial market in the world: structured finance. The $16+ trillion market is the backbone of everyday activities that empower financial freedom, from consolidating credit card debt and refinancing student loans, to buying a home and starting a small business. dv01's data analytics platform brings unparalleled transparency into investment performance and risk for lenders and Wall Street investors in structured products. As a data-first company, we wrangle critical loan data and build modern analytical tools that enable strategic decision-making for responsible lending. In a nutshell, we're helping prevent a repeat of the 2008 global financial crisis by offering the data and tools required to make smarter data-driven decisions resulting in a safer world for all of us. More than 400 of the largest financial institutions use dv01 for our coverage of over 75 million loans spanning mortgages, personal loans, auto, buy-now-pay-later programs, small business, and student loans. dv01 continues to expand coverage of new markets, adding loans monthly, and developing new technologies for the structured products universe. YOU WILL: Design and architect state-of-the-art AI/ML solutions to unlock valuable insights from unstructured banking documents: You will build document parsers to assist with the analysis of challenging document types to better inform customers of market conditions. You will build document classifiers to better improve the reliability and speed of our data ingestion processes and to provide enhanced intelligence for downstream users. Build, deploy and maintain new and existing AI/ML solutions: You will participate in the end-to-end software development of new feature functionality and design capabilities. You will write clean, testable, and maintainable code. 
You will also establish meaningful criteria for evaluating algorithm performance and suitability, and be prepared to optimize processes and make informed tradeoffs across speed, performance, cost-effectiveness, and accuracy. Interact with a diverse team: We operate in a highly collaborative environment, which means you'll interact with internal domain experts, product managers, engineers, and designers on a daily basis. Keep up to date with AI/ML best practices and evolving open-source frameworks: You will regularly seek out innovation and continuous improvement, finding efficiency in all assigned tasks. YOU ARE: Experienced with production systems: You have 3+ years of professional experience writing production-ready code in a language such as Python. Experienced with MLOps: Familiar with building end-to-end AI/ML systems with MLOps and DevOps practices (CI/CD, continuous training, evaluation, and performance tracking) with hands-on experience using MLflow. Experienced with the latest AI/ML developments: You will also have at least 3 years of experience in hands-on development of machine learning models using frameworks like PyTorch and TensorFlow, with at least 1 year focused on generative AI using techniques such as RAG, Agentic AI, and Prompt Engineering. Familiarity with Claude and/or Gemini is desired. A highly thoughtful AI engineer: You have strong communication skills and experience collaborating cross-functionally with product, design, and engineering. Experienced with Cloud & APIs: Proficient working in cloud environments (GCP preferred), as well as in contributing to containerized applications (Docker, Kubernetes) and creating APIs using FastAPI, Flask, or other frameworks. Experienced with Big Data Systems: You will have hands-on experience with Databricks, BigQuery or comparable big data systems.
Strong SQL skills; bonus points for familiarity with dbt. Forward-thinking: Proactive and innovative, with the ability to explore uncharted solutions and tackle challenges that don't have predefined answers. In good faith, our salary range for this role is $145,000-$160,000, but we are not tied to it. Final offer amount will be at the company's sole discretion and determined by multiple factors, including years and depth of experience, expertise, and other business considerations. Our community is fueled by diverse people who welcome differing points of view and the opportunity to learn from each other. Our team is passionate about building a product people love and a culture where everyone can innovate and thrive. BENEFITS & PERKS: Unlimited PTO. Unplug and rejuvenate, however you want, whether that's vacationing on the beach or at home on a mental-health day. $1,000 Learning & Development Fund. No matter where you are in your career, always invest in your future. We encourage you to attend conferences, take classes, and lead workshops. We also host hackathons, brunch & learns, and other employee-led learning opportunities. Remote-First Environment. People thrive in a flexible and supportive environment that best invigorates them. You can work from your home, cafe, or hotel. You decide. Health Care and Financial Planning. We offer a comprehensive medical, dental, and vision insurance package for you and your family. We also offer a 401(k) for you to contribute. Stay active your way! Get $138/month to put toward your favorite gym or fitness membership - wherever you like to work out. Prefer to exercise at home? You can also use up to $1,650 per year through our Fitness Fund to purchase workout equipment, gear, or other wellness essentials. New Family Bonding. Primary caregivers can take 16 weeks of 100% paid leave, while secondary caregivers can take 4 weeks.
Returning to work after bringing home a new child isn't easy, which is why we're flexible and empathetic to the needs of new parents. dv01 is an equal opportunity employer and all qualified applicants and employees will receive consideration for employment opportunities without regard to race, color, religion, creed, sex, sexual orientation, gender identity or expression, age, national origin or ancestry, citizenship, veteran status, membership in the uniformed services, disability, genetic information or any other basis protected by applicable law.
    $145k-160k yearly Auto-Apply 2d ago
  • Engineer, Strategic

    Constellation Energy 4.9 company rating

    Data engineer job in Oswego, NY

    Who We Are As the nation's largest producer of clean, carbon-free energy, Constellation is focused on our purpose: accelerating the transition to a carbon-free future. We have been the leader in clean energy production for more than a decade, and we are cultivating a workplace where our employees can grow, thrive, and contribute. Our culture and employee experience make it clear: We are powered by passion and purpose. Together, we're creating healthier communities and a cleaner planet, and our people are the driving force behind our success. At Constellation, you can build a fulfilling career with opportunities to learn, grow and make an impact. By doing our best work and meeting new challenges, we can accomplish great things and help fight climate change. Join us to lead the clean energy future. Total Rewards Constellation offers a wide range of benefits and rewards to help our employees thrive professionally and personally. We provide competitive compensation and benefits that support both employees and their families, helping them prepare for the future. In addition to highly competitive salaries, we offer a bonus program; 401(k) with company match; employee stock purchase program; comprehensive medical, dental and vision benefits, including a robust wellness program; paid time off for vacation, holidays, and sick days; and much more. Degreed Engineers ***This Engineering role can be filled at the Entry, Mid-level or Senior Engineer level. Please see minimum qualifications list below for each level*** Expected salary range: Entry-Level - $85,000; Mid-Level - $90,000 - $110,000; Sr Level - $117,000 - $143,000. Ranges are per year based on experience, along with a comprehensive benefits package that includes bonus and 401K. Primary Purpose of Position Performs advanced technical/engineering problem solving in support of nuclear plant operations. Responsible for technical decisions.
Possesses excellent knowledge in functional discipline and its practical application and has detailed knowledge of applicable industry codes and regulations. Primary Duties and Accountabilities Perform engineering and technical tasks as assigned by supervision applying general engineering principles. Assure all engineering products prepared are in accordance with applicable safety analyses, industry codes, engineering specifications and all regulatory requirements. Participate in the development and implementation of effective processes and techniques at appropriate levels of detail and in compliance with established policies and procedures. Recommend format and methodology improvements to standard processes and procedures. Perform other job assignments and duties as directed by management or pursuant to company policy, including but not limited to emergency response, departmental coverage, call outs, and support of outage activities in positions outside the department. MINIMUM QUALIFICATIONS for Entry Level E01 Engineer - New Graduate Bachelor's degree in Engineering (Chemical, Civil/Structural, Electrical, Industrial, Mechanical or Nuclear). Maintain minimum access requirements or unescorted access requirements, as applicable, and favorable medical examination and/or testing in accordance with position duties. MINIMUM QUALIFICATIONS for Mid-level E02 Engineer Bachelor's degree in engineering (chemical, civil/structural, electrical, industrial, mechanical or nuclear) with 2 years of nuclear or related engineering experience. Maintain minimum access requirements or unescorted access requirements, as applicable, and favorable medical examination and/or testing in accordance with position duties. MINIMUM QUALIFICATIONS for Senior E03 Engineer Bachelor's degree in Engineering (Chemical, Civil/Structural, Electrical, Industrial, Mechanical or Nuclear) with 5 years of nuclear experience or related engineering experience. Maintain minimum access requirement
or unescorted access requirements, as applicable, and favorable medical examination and/or testing in accordance with position duties. Preferred Qualifications Engineer in Training Certification. Good grasp of techniques and a good understanding of the fundamental functions performed by the group. As responsibility increases within the organization Experience with diesel engines or diesel generators
    $117k-143k yearly Auto-Apply 2d ago
  • Lead HPC Architect Cybersecurity - High Performance & Computational Data Ecosystem

    Icahn School of Medicine at Mount Sinai 4.8 company rating

    Data engineer job in New York, NY

    The Scientific Computing and Data group at the Icahn School of Medicine at Mount Sinai partners with scientists to accelerate scientific discovery. To achieve these aims, we support a cutting-edge high-performance computing and data ecosystem along with MD/PhD-level support for researchers. The group is composed of a high-performance computing team, a clinical data warehouse team and a data services team. The Lead HPC Architect, Cybersecurity, High Performance Computational and Data Ecosystem, is responsible for designing, implementing, and managing the cybersecurity infrastructure and technical operations of Scientific Computing's computational and data science ecosystem. This ecosystem includes a 25,000+ core, 40+ petabyte usable high-performance computing (HPC) system, clinical research databases, and a software development infrastructure for local and national projects. The HPC system is the fastest in the world at any academic biomedical center (Top 500 list). To meet Sinai's scientific and clinical goals, the Lead brings a strategic, tactical and customer-focused vision to evolve the ecosystem to be continually more resilient, secure, scalable and productive for basic and translational biomedical research. The Lead combines deep technical expertise in cybersecurity, HPC systems, storage, networking, and software infrastructure with a strong focus on service, collaboration, and strategic planning for researchers and clinicians throughout the organization and beyond. The Lead is an expert troubleshooter, productive partner and leader of projects. The Lead will work with stakeholders to make sure the HPC infrastructure is in compliance with governmental funding agency requirements and to promote efficient resource utilization for researchers. This position reports to the Director for HPC and Data Ecosystem in Scientific Computing and Data.
Key Responsibilities: HPC Cybersecurity & System Administration: Design, implement, and manage all cybersecurity operations within the HPC environment, ensuring alignment with industry standards (NIST, ISO, GDPR, HIPAA, CMMC, NYC Cyber Command, etc.). Implement best practices for data security, including but not limited to encryption (at rest, in transit, and in use), audit logging, access control, authentication control, configuration management, secure enclaves, and confidential computing. Perform full-spectrum HPC system administration: installation, monitoring, maintenance, usage reporting, troubleshooting, backup and performance tuning across HPC applications, web service, database, job scheduler, networking, storage, compute, and hardware to optimize workload efficiency. Lead resolution of complex cybersecurity and system issues; provide mentorship and technical guidance to team members. Ensure that all designs and implementations meet cybersecurity, performance, scalability, and reliability goals. Ensure that the design and operation of the HPC ecosystem is productive for research. Lead the integration of HPC resources with laboratory equipment, such as genomic sequencers, microscopy, and clinical systems, for data ingestion aligned with all regulatory requirements. Develop, review and maintain security policies, risk assessments, and compliance documentation accurately and efficiently. Collaborate with institutional IT, compliance, and research teams to ensure regulatory, Sinai policy, and operational alignment. Design and implement hybrid and cloud-integrated HPC solutions using on-premise and public cloud resources. Partner with other peers regionally, nationally and internationally to discover, propose and deploy a world-class research infrastructure for Mount Sinai. Stay current with emerging HPC, cloud, and cybersecurity technologies to keep the organization's infrastructure up-to-date.
Work collaboratively, effectively and productively with other team members within the group and across Mount Sinai. Provide after-hours support as needed. Perform other duties as assigned or requested. Requirements: Bachelor's degree in computer science, engineering or another scientific field. Master's or PhD preferred. 10 years of progressive HPC system administration experience with Enterprise Linux releases including RedHat/CentOS/Rocky Systems, and batch cluster environment. Experience with all aspects of high-throughput HPC including schedulers (LSF or Slurm), networking (Infiniband/Gigabit Ethernet), parallel file systems and storage, configuration management systems (xCAT, Puppet and/or Ansible), etc. Proficient in cybersecurity processes, posture, regulations, approaches, protocols, firewalls, data protection in a regulated environment (e.g. finance, healthcare). In-depth knowledge of HIPAA, NIST, FISMA, GDPR and related compliance standards, with proven experience building and maintaining compliant HPC systems. Experience with secure enclaves and confidential computing. Proven ability to provide mentorship and technical leadership to team members. Proven ability to lead complex projects to completion in collaborative, interdisciplinary settings with minimum guidance. Excellent analytical ability and troubleshooting skills. Excellent communication, documentation, collaboration and interpersonal skills. Must be a team player and customer focused. Scripting and programming experience. Preferred Experience Proficient with cloud services, orchestration tools, OpenShift/Kubernetes cost optimization and hybrid HPC architectures. Experience with Azure, AWS or Google cloud services. Experience with LSF job scheduler and GPFS Spectrum Scale. Experience in a healthcare environment. Experience in a research environment is highly preferred. Experience with software that enables privacy-preserving linking of PHI. Experience with Globus data transfer.
Experience with Web service, SAP HANA, Oracle, SQL, MariaDB and other database technologies. Strength through Unity and Inclusion The Mount Sinai Health System is committed to fostering an environment where everyone can contribute to excellence. We share a common dedication to delivering outstanding patient care. When you join us, you become part of Mount Sinai's unparalleled legacy of achievement, education, and innovation as we work together to transform healthcare. We encourage all team members to actively participate in creating a culture that ensures fair access to opportunities, promotes inclusive practices, and supports the success of every individual. At Mount Sinai, our leaders are committed to fostering a workplace where all employees feel valued, respected, and empowered to grow. We strive to create an environment where collaboration, fairness, and continuous learning drive positive change, improving the well-being of our staff, patients, and organization. Our leaders are expected to challenge outdated practices, promote a culture of respect, and work toward meaningful improvements that enhance patient care and workplace experiences. We are dedicated to building a supportive and welcoming environment where everyone has the opportunity to thrive and advance professionally. Explore this opportunity and be part of the next chapter in our history. About the Mount Sinai Health System: Mount Sinai Health System is one of the largest academic medical systems in the New York metro area, with more than 48,000 employees working across eight hospitals, more than 400 outpatient practices, more than 300 labs, a school of nursing, and a leading school of medicine and graduate education. 
Mount Sinai advances health for all people, everywhere, by taking on the most complex health care challenges of our time - discovering and applying new scientific learning and knowledge; developing safer, more effective treatments; educating the next generation of medical leaders and innovators; and supporting local communities by delivering high-quality care to all who need it. Through the integration of its hospitals, labs, and schools, Mount Sinai offers comprehensive health care solutions from birth through geriatrics, leveraging innovative approaches such as artificial intelligence and informatics while keeping patients' medical and emotional needs at the center of all treatment. The Health System includes more than 9,000 primary and specialty care physicians; 13 joint-venture outpatient surgery centers throughout the five boroughs of New York City, Westchester, Long Island, and Florida; and more than 30 affiliated community health centers. We are consistently ranked by U.S. News & World Report's Best Hospitals, receiving high "Honor Roll" status. Equal Opportunity Employer The Mount Sinai Health System is an equal opportunity employer, complying with all applicable federal civil rights laws. We do not discriminate, exclude, or treat individuals differently based on race, color, national origin, age, religion, disability, sex, sexual orientation, gender, veteran status, or any other characteristic protected by law. We are deeply committed to fostering an environment where all faculty, staff, students, trainees, patients, visitors, and the communities we serve feel respected and supported. Our goal is to create a healthcare and learning institution that actively works to remove barriers, address challenges, and promote fairness in all aspects of our organization.
    $89k-116k yearly est. 3d ago
  • Founding Engineer

    Pact 3.8 company rating

    Data engineer job in New York, NY

    What is pact? pact is a cancer copilot that helps oncology patients and their families take control of their care. We connect patients to cutting-edge treatments and relevant clinical trials in seconds, not weeks. Built by patients and caregivers, pact is the tool we wish had existed during our own cancer journeys. We are seeking like-minded change agents to join us in accelerating patient access to clinical trials and better treatment options. Who are we looking for? We are hiring a Full Stack Engineer for a full-time, in-person role in SF or NYC, with a clear path to becoming a founding team member. You will work directly with our CTO to own and build core product features across the stack, from AI-powered trial matching to patient- and clinician-facing experiences. This role offers real technical and product ownership at an early stage, with influence over architecture, roadmap, and execution. PACT is building at the frontier of cancer care, and this is an opportunity to help shape the future of how patients access life-changing treatments. We are looking for someone who wants responsibility, speed, and impact from day one. Qualifications: Deep passion for improving the experience of cancer patients and their families; 2+ years of software engineering experience at a big tech company or health-tech startup; desire and comfort to work and thrive in ambiguity; strong communication skills and ability to collaborate across product, design, and clinical stakeholders; frontend expertise: TypeScript, React, Next.js, Figma; backend expertise: Python, gRPC, API contract building; experience deploying on GCP; experience building software in regulated healthcare environments, including HIPAA and patient privacy requirements. Compensation: Competitive early-stage base salary; meaningful equity with long-term upside.
    $59k-79k yearly est. 1d ago
  • Senior Architect - NYC Code, Development & CA Specialist

    The Highrise Group

    Data engineer job in New York, NY

    Highrise | Brooklyn, NY (Hybrid) Highrise is a full-service Architecture, Expediting, and Development firm focused on complex New York City projects. We are seeking a high-level Senior Architect to join our in-house Brooklyn team and lead projects from design through construction. This is a hybrid position with 2-3 days remote and the remainder in our Brooklyn office. Role & Responsibilities Lead architectural design and documentation for new buildings and major alterations Manage Construction Administration (CA), including: RFIs, submittals, shop drawings, and field conditions Site visits and coordination with GC, consultants, and ownership Issue resolution during construction Ensure compliance with NYC Zoning, Building Code, Energy Code, and ADA Coordinate DOB filings and agency reviews Collaborate closely with ownership, expeditors, engineers, and development teams Review work and mentor junior staff Qualifications Licensed Architect preferred (NY license a strong plus) Proven experience in Construction Administration for NYC projects Deep working knowledge of NYC Code, Zoning, Energy, and ADA Strong development-driven project background Advanced proficiency in AutoCAD and Revit Able to independently manage projects from concept through CO What We Offer Very competitive compensation Generous PTO Hybrid schedule (2-3 days remote) Direct access to ownership and decision-making Long-term growth within a fast-growing firm High-quality NYC development projects 📍 Location: Brooklyn, NY 🗓 Schedule: Full-Time, Hybrid (2-3 days remote)
    $99k-129k yearly est. 5d ago
  • Principal Data Scientist, Advanced ML & Optimization

    Via 3.6 company rating

    Data engineer job in New York

    Via is using technology to transform transportation around the world. From changing a single person's daily commute to reducing humanity's collective environmental footprint - we've got huge goals. As a Principal Data Scientist - Advanced ML and Optimization, you'll utilize advanced quantitative & statistical techniques to drive business model and product innovation for Via. What You'll Do: Use optimization theory and/or reinforcement learning and/or graph-based methods to improve transportation networks. Own projects end-to-end, including data querying and cleaning, model/algorithm development, model/algorithm deployment, and, when necessary, building proof-of-concept web apps that leverage your models/algorithms. Hone your presentation and communication skills by giving regular reports on your progress and findings. Who You Are: Have a Master's or PhD degree from a top-tier university in neuroscience, physics, biology, math, economics, computer science, operations research, or other highly quantitative field, and a record of exceptional academic achievement. Have worked in a full-stack setting (from data to proof-of-concept application), or are eager to gain this skillset. Are hungry for knowledge. You enjoy reading and understanding recent scientific publications, and often build toy implementations of novel techniques to see how they function. Have completed 2+ projects that utilized optimization, reinforcement learning, and/or graph-based methods in either an academic or industry environment. Have written at least 10,000 lines of high-quality functional or object-oriented code in Python or other language suitable for data analytics. Are a natural relationship builder that values camaraderie, comes with a sense of humor, and doesn't take themselves too seriously. Can think on your feet and are curious, coachable, and self-motivated. 
Are an extraordinary communicator with demonstrated writing, editing and visualization skills. What Catches Our Eye: Prior experience with SQL is a plus. Prior experience with graph neural networks is a plus. Authorship of one or more high-impact publications is a plus. Prior role at a startup, or successful completion of a Postdoc, or experience working in another fast-paced environment is a plus. Compensation and Benefits: Final salary will be determined by the candidate's experience, knowledge, and skills. Salary reflected does not include equity or variable pay, where applicable. Salary Range: $125,000-$165,000. We are proud to offer a generous and comprehensive benefits package, including free medical plans and 401K matching. We're Via, and we build technology that changes the way the world moves. We're driven by a simple mission: to create modern and efficient public transportation systems that provide far greater access to jobs, healthcare, and education. With our best-in-class suite of products, we make transit thrive. Our teams of world-class engineers, data scientists, product managers, operations specialists, marketers, transit experts and more bring cutting-edge AI-powered software and innovative technology-enabled operations to our partners across the globe. Founded in 2012, Via builds solutions to digitize, automate, and enable data-driven decision making for entire transportation networks: fixed-route buses, microtransit, paratransit, school buses, autonomous vehicles, and more. If you're excited to be at the forefront of modernizing the future of transportation, are up for solving tough problems, and willing to become/already are a transit nerd, we are the place for you. Even if your past experience doesn't align perfectly with every qualification in the job description for this role, we encourage you to apply. You may be just the right candidate for this or other opportunities. Ready to join the ride? Via is an equal opportunity employer.
    $125k-165k yearly Auto-Apply 60d+ ago
  • Staff Data Scientist

    Recursion Pharmaceuticals 4.2 company rating

    Data engineer job in New York, NY

    Your work will change lives. Including your own. Please note: Our offices will be closed for our annual winter break from December 22, 2025, to January 2, 2026. Our response to your application will be delayed. The Impact You'll Make As a member of Recursion's AI-driven drug discovery initiatives, you will be at the forefront of reimagining how biological knowledge is generated, stored, accessed, and reasoned upon by LLMs. You will play a key role in developing the biological reasoning infrastructure, connecting large-scale data and codebases with dynamic, agent-driven AI systems. You will be responsible for defining the architecture that grounds our agents in biological truth. This involves integrating biomedical resources to enable AI systems to reason effectively and selecting the most appropriate data retrieval strategies to support those insights. This is a highly collaborative role: you will partner with machine learning engineers, biologists, chemists, and platform teams to build the connective tissue that allows our AI agents to reason like a scientist. The ideal candidate possesses deep expertise in both core bioinformatics/cheminformatics libraries and modern GenAI frameworks (including RAG and MCP), a strong architectural vision, and the ability to translate high-potential prototypes into scalable production workflows. In this role, you will: * Architect and maintain robust infrastructure to keep critical internal and external biological resources (e.g., ChEMBL, Ensembl, Reactome, proprietary assays) up-to-date and accessible to reasoning agents. * Design sophisticated context retrieval strategies, choosing the most effective approach for each biological use case, whether working with structured, entity-focused data, unstructured RAG, or graph-based representations.
* Integrate established bioinformatics/cheminformatics libraries into a GenAI ecosystem, creating interfaces (such as via MCP) that allow agents to autonomously query and manipulate biological data. * Pilot methods for tool use by LLMs, enabling the system to perform complex tasks like pathway analysis on the fly rather than relying solely on memorized weights. * Develop scalable, production-grade systems that serve as the backbone for Recursion's automated scientific reasoning capabilities. * Collaborate cross-functionally with Recursion's core biology, chemistry, data science and engineering teams to ensure our biological data and the reasoning engines are accurately reflecting the complexity of disease biology and drug discovery. * Present technical trade-offs (e.g., graph vs. vector) to leadership and stakeholders in a clear, compelling way that aligns technical reality with product vision. The Team You'll Join You'll join a bold, agile team of scientists and engineers dedicated to building comprehensive biological maps by integrating Recursion's in-house datasets, patient data, and external knowledge layers to enable sophisticated agent-based reasoning. Within this cross-functional team, you will design and maintain the biological context and data structures that allow agents to reason accurately and efficiently. You'll collaborate closely with wet-lab biologists and core platform engineers to develop systems that are not only technically robust but also scientifically rigorous. The ideal candidate is curious about emerging AI technologies, passionate about making biological data both machine-readable and machine-understandable, and brings a strong foundation in systems biology, biomedical data analysis, and agentic AI systems. 
The Experience You'll Need * PhD in a relevant field (Bioinformatics, Cheminformatics, Computational Biology, Computer Science, Systems Biology) with 5+ years of industry experience, or MS in a relevant field with 7+ years of experience, focusing on biological data representation and retrieval. * Proficiency in utilizing major public biological databases (NCBI, Ensembl, STRING, GO) and using standard bioinformatics/cheminformatics toolkits (e.g., RDKit, samtools, Biopython). * Strong skills in designing and maintaining automated data pipelines that support continuous ingestion, transformation, and refresh of biological data without manual intervention. * Ability to work with knowledge graph data models and query languages (e.g., RDF, SPARQL, OWL) and translate graph-structured data into relational or other non-graph representations, with a strong judgment in evaluating trade-offs between different approaches. * Competence in building and operating GenAI stacks, including RAG systems, vector databases, and optimization of context windows for large-scale LLM deployments. * Hands-on expertise with agentic AI frameworks (e.g., MCP, Google ADK, LangChain, AutoGPT) and familiarity with leading LLMs (e.g., Google Gemini/Gemma) in agentic workflows, including benchmarking and evaluating agent performance on bioinformatics/cheminformatics tasks such as structure prediction, target identification, and pathway mapping. * Strong Python skills and adherence to software engineering best practices, including CI/CD, Git-based version control, and modular design. * Excellent cross-functional communication skills, ability to clearly explain complex architectural decisions to both scientific domain experts and technical stakeholders. Nice to Have * Strong background in machine learning and deep learning, including hands-on experience with foundation models and modern neural architectures. * Fine-tuning LLMs on scientific corpora for domain-specific reasoning. 
* Integrating LLMs with experimental or proprietary assay data in live scientific workflows. * Background in drug discovery and target identification. * Meaningful contributions to open-source libraries, research codebases, or community-driven tools. Working Location & Compensation: This is an office-based, hybrid role in either our Salt Lake City, UT or New York City, NY offices. Employees are expected to work in the office at least 50% of the time. At Recursion, we believe that every employee should be compensated fairly. Based on the skill and level of experience required for this role, the estimated current annual base range for this role is $200,600 - $238,400. You will also be eligible for an annual bonus and equity compensation, as well as a comprehensive benefits package. The Values We Hope You Share: * We act boldly with integrity. We are unconstrained in our thinking, take calculated risks, and push boundaries, but never at the expense of ethics, science, or trust. * We care deeply and engage directly. Caring means holding a deep sense of responsibility and respect - showing up, speaking honestly, and taking action. * We learn actively and adapt rapidly. Progress comes from doing. We experiment, test, and refine, embracing iteration over perfection. * We move with urgency because patients are waiting. Speed isn't about rushing but about moving the needle every day. * We take ownership and accountability. Through ownership and accountability, we enable trust and autonomy: leaders take accountability for decisive action, and teams own outcomes together. * We are One Recursion. True cross-functional collaboration is about trust, clarity, humility, and impact. Through sharing, we can be greater than the sum of our individual capabilities. Our values underpin the employee experience at Recursion. 
They are the character and personality of the company demonstrated through how we communicate, support one another, spend our time, make decisions, and celebrate collectively. More About Recursion Recursion (NASDAQ: RXRX) is a clinical stage TechBio company leading the space by decoding biology to radically improve lives. Enabling its mission is the Recursion OS, a platform built across diverse technologies that continuously generate one of the world's largest proprietary biological and chemical datasets. Recursion leverages sophisticated machine-learning algorithms to distill from its dataset a collection of trillions of searchable relationships across biology and chemistry unconstrained by human bias. By commanding massive experimental scale - up to millions of wet lab experiments weekly - and massive computational scale - owning and operating one of the most powerful supercomputers in the world, Recursion is uniting technology, biology and chemistry to advance the future of medicine. Recursion is headquartered in Salt Lake City, where it is a founding member of BioHive, the Utah life sciences industry collective. Recursion also has offices in Toronto, Montréal, New York, London, Oxford area, and the San Francisco Bay area. Learn more at ****************** or connect on X (formerly Twitter) and LinkedIn. Recursion is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other characteristic protected under applicable federal, state, local, or provincial human rights legislation. Accommodations are available on request for candidates taking part in all aspects of the selection process. Recruitment & Staffing Agencies: Recursion Pharmaceuticals and its affiliate companies do not accept resumes from any source other than candidates. 
The submission of resumes by recruitment or staffing agencies to Recursion or its employees is strictly prohibited unless contacted directly by Recursion's internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Recursion, and Recursion will not owe any referral or other fees. Our team will communicate directly with candidates who are not represented by an agent or intermediary unless otherwise agreed to prior to interviewing for the job.
    $200.6k-238.4k yearly Auto-Apply 2d ago
  • Data Scientist II - Marketing Mix Models

    Walt Disney Co 4.6 company rating

    Data engineer job in New York, NY

    Marketing science - a sub-team within marketing analytics at Disney's Direct to Consumer team (Hulu, Disney+, ESPN+ and Star) - is in search of an econometrician to run marketing mix models (MMM) and associated ancillary analysis. This position will work as part of a team focused primarily on econometric modeling, which also provides support for downstream practices used to inform marketing investment. The analyst plays a hands-on role in modeling efforts. The ideal candidate has a substantial quantitative skill set with direct experience in marketing science practices (MMM, attribution modeling, testing / experimentation, etc.), and should serve as a strong mentor to analysts, helping to onboard new talent in support of wider company goals. Technical acumen as well as narrative-building are integral to the success of this role. Responsibilities * Build, sustain and scale econometric models (MMM) for Disney Streaming Services with support from data engineering and data product teams * Quantify ROI on marketing investment, determine optimal spend range across the portfolio, identify proposed efficiency caps by channel, set budget amounts and inform subscriber acquisition forecasts * Support ad hoc strategic analysis to provide recommendations that drive increased return on spend through shifts in mix, flighting, messaging and tactics, and that help cross-validate model results * Provide insights to marketing and finance teams, helping to design and execute experiments to move recommendations forward based on company goals (e.g., subscriber growth, LTV, etc.) * Support long-term MMM (et al.) 
automation, productionalization and scale with support from data engineering and product * Build out front-end reporting and dashboarding in partnership with data product analysts and data engineers to communicate performance metrics across services, markets, channels and subscriber types Basic Qualifications * Bachelor's degree in advanced Mathematics, Statistics, Data Science or comparable field of study * 3+ years of experience in a marketing data science / analytics role with understanding of measurement and optimization best practices * Coursework or direct experience in applied econometric modeling, ideally in support of measuring marketing efficiency and optimizing spend, flighting and mix to maximize return on ad spend (i.e., MMM) * Exposure to / understanding of media attribution practices for digital and linear media, the data required to power them and methodologies for measurement * Understanding of incrementality experiments to validate model recommendations and gain learnings on channel/publisher efficacy * Exposure to / familiarity with BI/data concepts and experience building out self-service marketing data solutions * Strong coding experience in one (or more) data programming languages like Python/R * Ability to draw insights and conclusions from data to inform model development and business decisions * Experience in SQL Preferred Qualifications * Master's degree in Computer Science, Engineering, Mathematics, Physics, Econometrics, or Statistics The hiring range for this position in Santa Monica, CA is $117,500 to $157,500 per year and in New York City, NY & Seattle, WA is $123,000 to $165,000. The base pay actually offered will take into account internal equity and also may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors. 
A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial and/or other benefits, dependent on the level and position offered. About Disney Direct to Consumer: Disney's Direct to Consumer team oversees the Hulu and Disney+ streaming businesses within Disney Entertainment helping to bring The Walt Disney Company's best-in-class storytelling to fans and families everywhere. About The Walt Disney Company: The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise that includes three core business segments: Disney Entertainment, ESPN, and Disney Experiences. From humble beginnings as a cartoon studio in the 1920s to its preeminent name in the entertainment industry today, Disney proudly continues its legacy of creating world-class stories and experiences for every member of the family. Disney's stories, characters and experiences reach consumers and guests from every corner of the globe. With operations in more than 40 countries, our employees and cast members work together to create entertainment experiences that are both universally and locally cherished. This position is with Disney Streaming Services LLC, which is part of a business we call Disney Direct to Consumer. Disney Streaming Services LLC is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, religion, color, sex, sexual orientation, gender, gender identity, gender expression, national origin, ancestry, age, marital status, military or veteran status, medical condition, genetic information or disability, or any other basis prohibited by federal, state or local law. Disney champions a business environment where ideas and decisions from all people help us grow, innovate, create the best stories and be relevant in a constantly evolving world. 
    $123k-165k yearly 20d ago
  • Lead Data Scientist

    Smarsh 4.6 company rating

    Data engineer job in New York

    Who are we? Smarsh empowers its customers to manage risk and unleash intelligence in their digital communications. Our growing community of over 6500 organizations in regulated industries counts on Smarsh every day to help them spot compliance, legal or reputational risks in 80+ communication channels before those risks become regulatory fines or headlines. Relentless innovation has fueled our journey to consistent leadership recognition from analysts like Gartner and Forrester, and our sustained, aggressive growth has landed Smarsh in the annual Inc. 5000 list of fastest-growing American companies since 2008. Summary As a Lead Data Scientist (NLP & Financial Compliance) at Smarsh, you will spearhead the development of state-of-the-art natural language processing (NLP) and large language model (LLM) solutions that power next-generation compliance and surveillance systems. You'll work on highly specialized problems at the intersection of natural language processing, communications intelligence, financial supervision, and regulatory compliance, where unstructured data from emails, chats, voice transcripts, and trade communications hold the keys to uncovering misconduct and risk. The role will involve working with other Senior Data Scientists and mentoring Associate Data Scientists in analyzing complex data, generating insights, and creating solutions as needed across a variety of tools and platforms. This role demands both technical excellence in NLP modeling and a deep understanding of financial domain behavior, including insider trading, market manipulation, off-channel communications, MNPI, bribery, and other supervisory risk areas. The ideal candidate for this position will possess the ability to perform both independent and team-based research and generate insights from large data sets with a hands-on, can-do attitude of servicing/managing day-to-day data requests and analysis. 
This role also offers a unique opportunity to get exposure to many problems and solutions associated with taking machine learning and analytics research to production. On any given day, you will have the opportunity to interface with business leaders, machine learning researchers, data engineers, platform engineers, data scientists and many more, enabling you to level up in true end-to-end data science proficiency. How will you contribute? Collect, analyze, and interpret small/large datasets to uncover meaningful insights to support the development of statistical methods / machine learning algorithms. Lead the design, training, and deployment of NLP and transformer-based models for financial surveillance and supervisory use cases (e.g., misconduct detection, market abuse, trade manipulation, insider communication). Development of machine learning models and other analytics following established workflows, while also looking for optimization and improvement opportunities. Data annotation and quality review. Exploratory data analysis and model fail state analysis. Contribute to model governance, documentation, and explainability frameworks aligned with internal and regulatory AI standards. Client/prospect guidance in machine learning model and analytic fine-tuning/development processes. Provide guidance to junior team members on model development and EDA. Work with Product Manager(s) to intake project/product requirements and translate these to technical tasks within the team's tooling, technique and procedures. Continued self-led personal development. What will you bring? 
Strong understanding of financial markets, compliance, surveillance, supervision, or regulatory technology Experience with one or more data science and machine/deep learning frameworks and tooling, including scikit-learn, H2O, keras, pytorch, tensorflow, pandas, numpy, caret, tidyverse Command of data science and statistics principles (regression, Bayes, time series, clustering, P/R, AUROC, exploratory data analysis etc…) Strong knowledge of key programming concepts (e.g. split-apply-combine, data structures, object-oriented programming) Solid statistics knowledge (hypothesis testing, ANOVA, chi-square tests, etc…) Knowledge of NLP transfer learning, including word embedding models (GloVe, fastText, word2vec) and transformer models (BERT, SBERT, HuggingFace, and GPT-x etc.) Experience with natural language processing toolkits like NLTK, spaCy, Nvidia NeMo Knowledge of microservices architecture and continuous delivery concepts in machine learning and related technologies such as helm, Docker and Kubernetes Familiarity with Deep Learning techniques for NLP. Familiarity with LLMs - using ollama & Langchain Excellent verbal and written skills Proven collaborator, thriving on teamwork Preferred Qualifications Master's or Doctor of Philosophy degree in Computer Science, Applied Math, Statistics, or a scientific field Familiarity with cloud computing platforms (AWS, GCS, Azure) Experience with automated supervision/surveillance/compliance tools About our culture Smarsh hires lifelong learners with a passion for innovating with purpose, humility and humor. Collaboration is at the heart of everything we do. We work closely with the most popular communications platforms and the world's leading cloud infrastructure platforms. We use the latest in AI/ML technology to help our customers break new ground at scale. We are a global organization that values diversity, and we believe that providing opportunities for everyone to be their authentic self is key to our success. 
Smarsh leadership, culture, and commitment to developing our people have all garnered Comparably.com Best Places to Work Awards. Come join us and find out what the best work of your career looks like.
    $76k-103k yearly est. Auto-Apply 4d ago
  • Big Data Consultant

    Pyramid It

    Data engineer job in New York

    Pyramid is a leading Information Technology Consulting services company headquartered in metropolitan Atlanta, GA with prime emphasis on the following service offerings: • Staff Augmentation • Lifecycle IT solutions o Application Development & Support o Outsourced Testing • Mobile Development and Test Automation The company was incorporated in the State of Georgia in 1996 and has grown to over 2500 Information Technology consultants serving clients across the United States and around the globe. In addition to Atlanta, Pyramid has offices worldwide including Charlotte, NC; Chicago, IL; Dallas, TX; Richmond, VA; San Francisco, CA and Somerset, NJ in the United States, London in the United Kingdom, Singapore, and three offices in India (New Delhi, Hyderabad and Chandigarh). Pyramid has been ranked by Staffing Industry Analysts as one of the largest diversity staffing firms specializing in IT and among the fastest growing U.S. staffing firms overall. In addition, Pyramid is a previous winner of the National Minority Supplier Development Council's Supplier of the Year and has won numerous Supplier of the Year awards from the Georgia Minority Supplier Development Council. Specialties IT Staff Augmentation, Application Management Services, Enterprise Project Solutions, Mobile Development, Mobile Test Automation, Product and Engineering Services, Enterprise Mobility, Test Automation, QA - Manual and Automated Testing, QA Strategy Website ************************ Industry Information Technology and Services Type Privately Held Company Size 1001-5000 employees Founded 1996 Job Description Experience: Overall 6+ Years and 1+ Year on Big Data Essential Functions/Responsibilities: Big Data / NoSQL DBA with overall 6+ years of experience in any of the DBA technologies and at least 1+ year in Big Data / NoSQL Administration. Should have hands-on experience with Hadoop administration, managing HDFS clusters and Hadoop metadata. 
Should be proficient with any one of the NoSQL technologies (HBase, MongoDB, Cassandra). Very good communication and customer handling skills. Should be a quick learner and able to manage and coordinate between various application and infrastructure teams. Additional Information All your information will be kept confidential according to EEO guidelines.
    $88k-117k yearly est. 17h ago
  • ETL Talend MDM Architect

    Trg 4.6 company rating

    Data engineer job in New York, NY

    Responsibilities: • Develop and test Extract, Transformation, and Loading (ETL) modules based on design specifications • Develop and test ETL Mappings in Talend • Plan, test, and deploy ETL mappings, and database code as part of application build process across the enterprise • Provide effective communications with all levels of internal and external customers and staff • Must demonstrate knowledge in the following areas: o Data Integration o Data Architecture o Team Lead experience is a plus • Understand, analyze, assess and recommend ETL environment from technology strategy and operational standpoint • Understand and assess source system data issues and recommend solution from data integration standpoint • Create high level, low level technical design documents for data integration • Design exceptions handling, audit and data resolution processes • Performance tune ETL environment • Conduct proof of concepts • Estimation of work based on functional requirements documents • Identify system deficiencies and recommending solutions • Designing, coding, and writing unit test cases from functional requirements • Delivering efficient and bug-free ETL packages and documentation • Maintenance and support of enterprise ETL jobs • Experience with Talend Hadoop tools is a plus Basic Qualifications: • 3+ years of development experience on Talend ETL tools • 7+ years working with one or more of the following ETL Tools: Talend, Informatica, Ab Initio or Data Stage • 7+ years proficient experience as a developer • Bachelor's Degree in Computer Science or equivalent • Database (Oracle, SQL Server, DB2) • Database Programming (Complex SQL, PL/SQL development knowledge) • Data Modeling • Business Analysis • Top level performer with ability to work independently in short time frames • Proficient working in a Linux environment • Experience in scripting languages (Shell, Python or Perl) • 5+ years of experience deploying large scale projects ETL projects that • 3+ years of 
experience in a development lead position • Data analysis, data mapping, data loading, and data validation • Understand reusability, parameterization, workflow design, etc. • Thorough understanding of the entire software life cycle and various software engineering methodologies • Performance tuning of interfaces that extract, transform and load tens of millions of records • Knowledge of Hadoop ecosystem technologies is a plus Additional Information If you are comfortable with the position and location, please reply at your earliest convenience with your updated resume and the following details, or call me back on my number. Full Name: Email: Skype id: Contact Nos.: Current Location: Open to relocate: Start Availability: Work Permit: Flexible time for INTERVIEW: Current Company: Current Rate: Expected Rate: Total IT Experience [Years]: Total US Experience [Years]: Key Skill Set: Best time to call: In case you are not interested, I would be very grateful if you could pass this position to colleagues or friends who might be interested. All your information will be kept confidential according to EEO guidelines.
    $100k-125k yearly est. 17h ago
  • ETL Architect

    Integrated Resources 4.5 company rating

    Data engineer job in New York, NY

    A Few Words About Us Integrated Resources, Inc is a premier staffing firm recognized as one of the tri-state's most well-respected professional specialty firms. IRI has built its reputation on excellent service and integrity since its inception in 1996. Our mission centers on delivering only the best quality talent, the first time and every time. We provide quality resources in four specialty areas: Information Technology (IT), Clinical Research, Rehabilitation Therapy and Nursing. Position: ETL Architect Location: NYC Duration: 6 months Job Description: This opportunity is for individuals who have hands-on experience in data warehouse design and development. The role demands more than a typical ETL lead role as it interacts outwardly on projects with architects, PMs, OPS, data modelers, developers, admins, DBAs and testers. This is a hands-on, delivery-focused role, and the individual will be responsible for technical delivery of data warehouse and data integration projects. Must have skills • 7-10 years hands-on experience with Informatica ETL in designing and developing ETL processes based on multiple sources using ETL tools • Experience in architecting end-to-end ETL solutions • Hands-on UNIX experience. Scripting (e.g. shell, perl, alerts, cron, automation) • Expert at all aspects of relational database design • Experience working with engineering team with respect to database-related performance tuning, writing of complex SQL, indexing, etc. Good to Have: • Experience with IDQ, MDM, other ETL tools • Experience with dashboard and report development • Experience with financial services firms will be preferred Additional Information Kind Regards Sachin Gaikwad Technical Recruiter Integrated Resources, Inc. Direct Line : 732-429-1920
    $102k-130k yearly est. 60d+ ago
  • Principal Data Scientist : Product to Market (P2M) Optimization

    The Gap 4.4 company rating

    Data engineer job in New York, NY

    About Gap Inc. Our brands bridge the gaps we see in the world. Old Navy democratizes style to ensure everyone has access to quality fashion at every price point. Athleta unleashes the potential of every woman, regardless of body size, age or ethnicity. Banana Republic believes in sustainable luxury for all. And Gap inspires the world to bring individuality to modern, responsibly made essentials. This simple idea, that we all deserve to belong, and on our own terms, is core to who we are as a company and how we make decisions. Our team is made up of thousands of people across the globe who take risks, think big, and do good for our customers, communities, and the planet. Ready to learn fast, create with audacity and lead boldly? Join our team. About the Role Gap Inc. is seeking a Principal Data Scientist with deep expertise in operations research and machine learning to lead the design and deployment of advanced analytics solutions across the Product-to-Market (P2M) space. This role focuses on driving enterprise-scale impact through optimization and data science initiatives spanning pricing, inventory, and assortment optimization. The Principal Data Scientist serves as a senior technical and strategic thought partner, defining solution architectures, influencing product and business decisions, and ensuring that analytical solutions are both technically rigorous and operationally viable. The ideal candidate can lead end-to-end solutioning independently, manage ambiguity and complex stakeholder dynamics, and communicate technical and business risk effectively across teams and leadership levels. What You'll Do * Lead the framing, design, and delivery of advanced optimization and machine learning solutions for high-impact retail supply chain challenges. * Partner with product, engineering, and business leaders to define analytics roadmaps, influence strategic priorities, and align technical investments with business goals. 
* Provide technical leadership to other data scientists through mentorship, design reviews, and shared best practices in solution design and production deployment. * Evaluate and communicate solution risks proactively, grounding recommendations in realistic assessments of data, system readiness, and operational feasibility. * Evaluate, quantify, and communicate the business impact of deployed solutions using statistical and causal inference methods, ensuring benefit realization is measured rigorously and credibly. * Serve as a trusted advisor by effectively managing stakeholder expectations, influencing decision-making, and translating analytical outcomes into actionable business insights. * Drive cross-functional collaboration by working closely with engineering, product management, and business partners to ensure model deployment and adoption success. * Quantify business benefits from deployed solutions using rigorous statistical and causal inference methods, ensuring that model outcomes translate into measurable value * Design and implement robust, scalable solutions using Python, SQL, and PySpark on enterprise data platforms such as Databricks and GCP. * Contribute to the development of enterprise standards for reproducible research, model governance, and analytics quality. Who You Are * Master's or Ph.D. in Operations Research, Operations Management, Industrial Engineering, Applied Mathematics, or a closely related quantitative discipline. * 10+ years of experience developing, deploying, and scaling optimization and data science solutions in retail, supply chain, or similar complex domains. * Proven track record of delivering production-grade analytical solutions that have influenced business strategy and delivered measurable outcomes. * Strong expertise in operations research methods, including linear, nonlinear, and mixed-integer programming, stochastic modeling, and simulation. 
* Deep technical proficiency in Python, SQL, and PySpark, with experience in optimization and ML libraries such as Pyomo, Gurobi, OR-Tools, scikit-learn, and MLlib. * Hands-on experience with enterprise platforms such as Databricks and cloud environments * Demonstrated ability to assess, communicate, and mitigate risk across analytical, technical, and business dimensions. * Excellent communication and storytelling skills, with a proven ability to convey complex analytical concepts to technical and non-technical audiences. * Strong collaboration and influence skills, with experience leading cross-functional teams in matrixed organizations. * Experience managing code quality, CI/CD pipelines, and GitHub-based workflows. Preferred Qualifications * Experience shaping and executing multi-year analytics strategies in retail or supply chain domains. * Proven ability to balance long-term innovation with short-term deliverables. * Background in agile product development and stakeholder alignment for enterprise-scale initiatives. Benefits at Gap Inc. * Merchandise discount for our brands: 50% off regular-priced merchandise at Old Navy, Gap, Banana Republic and Athleta, and 30% off at Outlet for all employees. * One of the most competitive Paid Time Off plans in the industry.* * Employees can take up to five "on the clock" hours each month to volunteer at a charity of their choice.* * Extensive 401(k) plan with company matching for contributions up to four percent of an employee's base pay.* * Employee stock purchase plan.* * Medical, dental, vision and life insurance.* * See more of the benefits we offer. * For eligible employees Gap Inc. is an equal-opportunity employer and is committed to providing a workplace free from harassment and discrimination. We are committed to recruiting, hiring, training and promoting qualified people of all backgrounds, and make all employment decisions without regard to any protected status. 
We have received numerous awards for our long-held commitment to equality and will continue to foster a diverse and inclusive environment of belonging. In 2022, we were recognized by Forbes as one of the World's Best Employers and one of the Best Employers for Diversity. Salary Range: $201,700 - $267,300 USD Employee pay will vary based on factors such as qualifications, experience, skill level, competencies and work location. We will meet minimum wage or minimum of the pay range (whichever is higher) based on city, county and state requirements.
    $88k-128k yearly est. 27d ago
  • Data Scientist, User Operations

    Openai 4.2 company rating

    Data engineer job in New York, NY

    About the Team OpenAI's User Operations organization is building the data and intelligence layer behind AI-assisted operations - the systems that decide when automation should help users, when humans should step in, and how both improve over time. Our flagship platform is transforming customer support into a model for “agent-first” operations across OpenAI. About the Role As a Data Scientist on User Operations, you'll design the models, metrics, and experimentation frameworks that power OpenAI's human-AI collaboration loop. You'll build systems that measure quality, optimize automation, and turn operational data into insights that improve product and user experience at scale. You'll partner closely with Support Automation Engineering, Product, and Data Engineering to ensure our data systems are production-grade, trusted, and impactful. This role is based in San Francisco or New York City. We use a hybrid work model of three days in the office per week and offer relocation assistance to new employees. Why it matters Every conversation users have with OpenAI products produces signals about how humans and AI interact. User Ops Data Science turns those signals into insights that shape how we support users today and design agentic systems for tomorrow. This is a unique opportunity to help define how AI collaboration at scale is measured and improved inside OpenAI. In this role, you will: Build and own metrics, classifiers, and data pipelines that determine automation eligibility, effectiveness, and guardrails. Design and evaluate experiments that quantify the impact of automation and AI systems on user outcomes like resolution quality and satisfaction. Develop predictive and statistical models that improve how OpenAI's support systems automate, measure, and learn from user interactions. Partner with engineering and product teams to create feedback loops that continuously improve our AI agents and knowledge systems. 
Translate complex data into clear, actionable insights for leadership and cross-functional stakeholders. Develop and socialize dashboards, applications, and other ways of enabling the team and company to answer product data questions in a self-serve way. Contribute to establishing data science standards and best practices in an AI-native operations environment. Partner with other data scientists across the company to share knowledge and continually synthesize learnings across the organization. You might thrive in this role if you have: 10+ years of experience in data science roles within product or technology organizations. Expertise in statistics and causal inference, applied in both experimentation and observational causal inference studies. Expert-level SQL and proficiency in Python for analytics, modeling, and experimentation. Proven experience designing and interpreting experiments and making statistically sound recommendations. Experience building data systems or pipelines that power production workflows or ML-based decisioning. Experience developing and extracting insights from business intelligence tools, such as Mode, Tableau, and Looker. Strategic and impact-driven mindset, capable of translating complex business problems into actionable frameworks. Ability to build relationships with diverse stakeholders and cultivate strong partnerships. Strong communication skills, including the ability to bridge technical and non-technical stakeholders and collaborate across various functions to ensure business impact. Ability to operate effectively in a fast-moving, ambiguous environment with limited structure. Strong communication skills and the ability to translate complex data into stories for non-technical partners. Nice-to-haves: Familiarity with large language models or AI-assisted operations platforms. Experience in operational automation or customer support analytics. Background in experimentation infrastructure or human-AI interaction systems.
About OpenAI OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement. Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations. 
To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance. We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link. OpenAI Global Applicant Privacy Policy At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
    $88k-129k yearly est. Auto-Apply 47d ago
  • Purview Data Governance Consultant

    Tryton TC LLC

    Data engineer job in New York, NY

    Job Description: Purview Data Governance Consultant OR (c2c). About Tryton TC: Tryton TC is a specialized IT staffing firm connecting top-tier technical talent with innovative organizations across the country. We take pride in partnering with exceptional professionals who deliver meaningful impact and value in every engagement. Position Overview: We are seeking a Data Quality Consultant to join our client's team in New York City. This role will focus on implementing and optimizing data quality solutions, ensuring clean, consistent, and reliable data across the organization. The ideal candidate will bring both technical expertise and strong communication skills to effectively engage with executive-level stakeholders. Key Responsibilities: Lead hands-on implementation and configuration of data quality tools and technologies. Collaborate with business and technical teams to define and maintain data quality standards. Partner with senior and executive stakeholders to communicate project progress and insights. Troubleshoot, optimize, and document data quality processes and workflows. Required Skills and Experience: Bachelor's degree in Computer Science, Information Systems, Data Management, or a related field. 3-5 years of professional experience in data quality, data governance, or data management roles. Proven experience implementing data quality tools or technologies. Expertise in at least one of the following: Microsoft Purview (Data Quality), Google Dataplex (Data Quality), or other tools such as Informatica, Ataccama, or SODA. Strong communication and presentation skills, with the ability to engage with executive-level stakeholders. Ability to work onsite in New York City 2-3 days per week. Preferred Qualifications: Experience within the energy and utilities sector. Advanced knowledge of data governance frameworks or data management best practices.
    $88k-118k yearly est. 15d ago
  • Data Scientist, GTM Analytics

    Airtable 4.2 company rating

    Data engineer job in New York, NY

    Airtable is the no-code app platform that empowers people closest to the work to accelerate their most critical business processes. More than 500,000 organizations, including 80% of the Fortune 100, rely on Airtable to transform how work gets done. Our data team's mission is to fuel Airtable's growth and operations. We are a strategic enabler, building high-quality and customer-centric data products and solutions. We are looking for a Data Scientist to work directly with Airtable's business stakeholders. Your data product will be instrumental in accelerating the efficiency of Customer Engagement (CE) organizations including sales, CSG and revenue operations teams. This role offers the opportunity to significantly impact Airtable's strategy and go-to-market execution, providing you with a platform to deploy your data skills in a way that directly contributes to our company's growth and success. What you'll do: Champion AI Driven Data Product with Scalability: Design and implement ML models and AI solutions to enable the CE team with actionable insights and recommendations. Build scalable data pipelines and automated workflows with MLOps best practices. Support Key Business Processes: Independently provide strategic insights, repeatable frameworks and thought partnership to support key CE business processes like territory carving, annual planning, pricing optimization and performance attribution. Strategic Analysis: Drive in-depth deep-dive analysis to ensure accuracy and relevance. Influence business stakeholders through clear data storytelling. Tackle ambiguous problems to uncover business value with minimal oversight.
Develop Executive Dashboards: Design, build, and maintain high-quality dashboards and BI tools. Partner with the Revenue Operations team to efficiently enable the wide range of roles on the CE team with the data products. Strong Communication Skills: Effectively communicate the “so-what” of an analysis, illustrating how insights can be leveraged to drive business impact across the organization. Who you are: Education: Bachelor's degree in a quantitative discipline (Math, Statistics, Operations Research, Economics, Engineering, or CS), MS/MBA preferred. Industry Experience: 4+ years of working experience as a data scientist / analytics engineer in high-growth B2B SaaS, preferably supporting sales, CSG or other go-to-market stakeholders. Demonstrated business acumen with a deep understanding of Enterprise Sales strategies (sales pipeline, forecast models, sales capacity, sales segmentation, quota planning), CSG strategies (customer churn risk model, performance attribution) and Enterprise financial metrics (ACV, ARR, NDR). Familiar with CRM platforms (e.g., Salesforce). Technical Proficiency: 6+ years of experience working with SQL in modern data platforms, such as Databricks, Snowflake, Redshift, BigQuery. 6+ years of experience working with Python or R for analytics or data science projects. 6+ years of experience building business facing dashboards and data models using modern BI tools like Looker, Tableau, etc. Proficient-level experience developing automated solutions to collect, transform, and clean data from various sources, using tools such as dbt and Fivetran. Proficient knowledge of data science models, such as regression, classification, clustering, time series analysis, and experiment design. Hands-on experience with batch LLM pipelines is preferred. Excellent communication skills to present findings to both technical and non-technical audiences. Passionate about thriving in a dynamic environment. That means being flexible and willing to jump in and do whatever it takes to be successful.
Airtable is an equal opportunity employer. We embrace diversity and strive to create a workplace where everyone has an equal opportunity to thrive. We welcome people of different backgrounds, experiences, abilities, and perspectives. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status or any characteristic protected by applicable federal and state laws, regulations and ordinances. Learn more about your EEO rights as an applicant. VEVRAA-Federal Contractor. If you have a medical condition, disability, or religious belief/practice which inhibits your ability to participate in any part of the application or interview process, please complete our Accommodations Request Form and let us know how we may assist you. Airtable is committed to participating in the interactive process and providing reasonable accommodations to qualified applicants. Compensation awarded to successful candidates will vary based on their work location, relevant skills, and experience. Our total compensation package also includes the opportunity to receive benefits, restricted stock units, and may include incentive compensation. To learn more about our comprehensive benefit offerings, please check out Life at Airtable. For work locations in the San Francisco Bay Area, Seattle, New York City, and Los Angeles, the base salary range for this role is: $179,500-$221,500 USD. For all other work locations (including remote), the base salary range for this role is: $161,500-$199,300 USD. Please see our Privacy Notice for details regarding Airtable's collection and use of personal information relating to the application and recruitment process by clicking here. 🔒 Stay Safe from Job Scams: All official Airtable communication will come from an @airtable.com email address.
We will never ask you to share sensitive information or purchase equipment during the hiring process. If in doubt, contact us at ***************. Learn more about avoiding job scams here.
    $179.5k-221.5k yearly Auto-Apply 3d ago
  • Windchill PLM Data Migration Consultant

    ACL Digital

    Data engineer job in Jamestown, NY

    Project Title: Data Migration to Windchill PLM. Project Description: This project involves the migration of engineering data, including documents and associated metadata, from the current storage solutions (Windows network drives) to the Windchill PLM system. The process will ensure that data is accurately segregated, transferred, and verified in collaboration with subject matter experts. Scope of Work: * Data Segregation: Identify and categorize engineering data. Segregate documents by their respective programs, part numbers, etc., and classify them as drawings, reports, specifications, etc. Prepare data for migration in accordance with Windchill PLM requirements. * Data Migration: Utilize automated tools and manual processes to transfer data to the Windchill PLM system. Ensure integrity and completeness during the migration. * Verification: Post-migration, conduct thorough checks with the help of subject matter experts to validate the accuracy and completeness of the migrated data. Deliverables: * Migration of the data to Windchill. * A comprehensive migration plan detailing the process and timelines. * Documentation of data segregation and preparation methods. * Reports on data migration status and any issues encountered. * Verification reports signed off by subject matter experts. Skills Required: * Bachelor's degree in mechanical engineering or industrial engineering or equivalent. * Proficiency in Microsoft Excel for data analysis and preparation. * Familiarity with engineering drawings, documents, and metadata structures. * Experience with Windchill PLM tools for data management and migration, including WEX tools to exchange data with Windchill. Project Timeline: Aug 2024 to March 2025 (8 months). Resource: Must be a US citizen and physically present in Falconer, Jamestown, NY.
    $84k-112k yearly est. 60d+ ago
  • Network Planning Data Scientist (Manager)

    Atlas Air Worldwide Holdings 4.9 company rating

    Data engineer job in White Plains, NY

    Atlas Air is seeking a detail-oriented and analytical Network Planning Analyst to help optimize our global cargo network. This role plays a critical part in the 2-year to 11-day planning window, driving insights that enable operational teams to execute the most efficient and reliable schedules. The successful candidate will provide actionable analysis on network delays, utilization trends, and operating performance, build models and reports to govern network operating parameters, and contribute to the development and implementation of software optimization tools that improve reliability and streamline planning processes. This position requires strong analytical skills, a proactive approach to problem-solving, and the ability to translate data into operational strategies that protect service quality and maximize network efficiency. Responsibilities: Analyze and Monitor Network Performance: Track and assess network delays, capacity utilization, and operating constraints to identify opportunities for efficiency gains and reliability improvements. Develop and maintain key performance indicators (KPIs) for network operations and planning effectiveness. Modeling & Optimization: Build and maintain predictive models to assess scheduling scenarios and network performance under varying conditions. Support the design, testing, and implementation of software optimization tools to enhance operational decision-making. Reporting & Governance: Develop periodic performance and reliability reports for customers, assisting in presentation creation. Produce regular and ad hoc reports to monitor compliance with established operating parameters. Establish data-driven processes to govern scheduling rules, protect operational integrity, and ensure alignment with reliability targets. Cross-Functional Collaboration: Partner with Operations, Planning, and Technology teams to integrate analytics into network planning and execution.
Provide insights that inform schedule adjustments, fleet utilization, and contingency planning. Innovation & Continuous Improvement: Identify opportunities to streamline workflows and automate recurring analyses. Contribute to the development of new planning methodologies and tools that enhance decision-making and operational agility. Qualifications: Proficiency in SQL (Python and R are a plus) for data extraction and analysis; experience building decision-support tools, reporting tools, and dashboards (e.g., Tableau, Power BI). Bachelor's degree required in Industrial Engineering, Operations Research, Applied Mathematics, Data Science or related quantitative discipline or equivalent work experience. 5+ years of experience in strategy, operations planning, finance or continuous improvement, ideally with airline network planning. Strong analytical skills with experience in statistical analysis, modeling, and scenario evaluation. Strong problem-solving skills with the ability to work in a fast-paced, dynamic environment. Excellent communication skills with the ability to convey complex analytical findings to non-technical stakeholders. A proactive, solution-focused mindset with a passion for operational excellence and continuous improvement. Knowledge of operations, scheduling, and capacity planning, ideally in airlines, transportation or other complex network operations. Salary Range: $131,500 - $177,500. Financial offer within the stated range will be based on multiple factors, including but not limited to location, relevant experience/level and skillset. The Company is an Equal Opportunity Employer.
It is our policy to afford equal employment opportunity to all employees and applicants for employment without regard to race, color, religion, sex, sexual orientation, national origin, citizenship, place of birth, age, disability, protected veteran status, gender identity or any other characteristic or status protected by applicable federal, state and local laws. If you'd like more information about your EEO rights as an applicant under the law, please download the available EEO is the Law document at ******************************************. To view our Pay Transparency Statement, please click here: Pay Transparency Statement. “Know Your Rights: Workplace Discrimination is Illegal” Poster. The "EEO Is The Law" Poster.
    $131.5k-177.5k yearly Auto-Apply 60d+ ago
  • Staff Data Scientist, Personalization & Shopping

    Pinterest 4.6 company rating

    Data engineer job in Day, NY

    Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we're on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product. Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other's unique experiences and embrace the flexibility to do your best work. Creating a career you love? It's Possible. Pinterest is the world's leading visual search and discovery platform, serving over 500 million monthly active users globally on their journey from inspiration to action. At Pinterest, Shopping is a strategic initiative that aims to help Pinners take action by surfacing the most relevant content, at the right time, in the best user-friendly way. We do this through a combination of innovative product interfaces, and sophisticated recommendation systems. We are looking for a Staff Data Scientist with experience in machine learning and causal inference to help advance Shopping at Pinterest. In your role you will develop methods and models to explain why certain content is being promoted (or not) for a Pinner. You will work in a highly collaborative and cross-functional environment, and be responsible for partnering with Product Managers and Machine Learning Engineers. You are expected to develop a deep understanding of our recommendation system, and generate insights and robust methodologies to answer the “why”. The results of your work will influence our development teams, and drive product innovation. What you'll do: Ensure that our recommendation systems produce trustworthy, high-quality outputs to maximize our Pinner's shopping experience. Develop robust frameworks, combining online and offline methods, to comprehensively understand the outputs of our recommendations. 
Bring scientific rigor and statistical methods to the challenges of product creation, development and improvement with an appreciation for the behaviors of our Pinners. Work cross-functionally to build relationships, proactively communicate key insights, and collaborate closely with product managers, engineers, designers, and researchers to help build the next experiences on Pinterest. Relentlessly focus on impact, whether through influencing product strategy, advancing our north star metrics, or improving a critical process. Mentor and up-level junior data scientists on the team. What we're looking for: 7+ years of experience analyzing data in a fast-paced, data-driven environment with proven ability to apply scientific methods to solve real-world problems on web-scale data. Strong interest and experience in recommendation systems and causal inference. Strong quantitative programming (Python/R) and data manipulation skills (SQL/Spark). Ability to work independently and drive your own projects. Excellent written and communication skills, and able to explain learnings to both technical and non-technical partners. A team player eager to partner with cross-functional partners to quickly turn insights into actions. Bachelor's/Master's degree in a relevant field such as Computer Science, or equivalent experience. In-Office Requirement Statement: We let the type of work you do guide the collaboration style. That means we're not always working in an office, but we continue to gather for key moments of collaboration and connection. This role will need to be in the office for in-person collaboration 1-2 times/quarter and therefore can be situated anywhere in the country. Relocation Statement: This position is not eligible for relocation assistance. Visit our PinFlex page to learn more about our working model. #LI-REMOTE #LI-NM4 At Pinterest we believe the workplace should be equitable, inclusive, and inspiring for every employee. 
In an effort to provide greater transparency, we are sharing the base salary range for this position. The position is also eligible for equity. Final salary is based on a number of factors including location, travel, relevant prior experience, or particular skills and expertise. Information regarding the culture at Pinterest and benefits available for this position can be found here. US based applicants only: $164,695-$339,078 USD. Our Commitment to Inclusion: Pinterest is an equal opportunity employer and makes employment decisions on the basis of merit. We want to have the best qualified people in every job. All qualified applicants will receive consideration for employment without regard to race, color, ancestry, national origin, religion or religious creed, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, age, marital status, status as a protected veteran, physical or mental disability, medical condition, genetic information or characteristics (or those of a family member) or any other consideration made unlawful by applicable federal, state or local laws. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you require a medical or religious accommodation during the job application process, please complete this form for support.
    $105k-141k yearly est. Auto-Apply 3d ago
  • ETL Talend MDM Architect

    TRG 4.6 company rating

    Data engineer job in New York, NY

    Responsibilities:
    • Develop and test Extract, Transformation, and Loading (ETL) modules based on design specifications
    • Develop and test ETL Mappings in Talend
    • Plan, test, and deploy ETL mappings and database code as part of the application build process across the enterprise
    • Provide effective communications with all levels of internal and external customers and staff
    • Must demonstrate knowledge in the following areas:
      o Data Integration
      o Data Architecture
      o Team Lead experience is a plus
    • Understand, analyze, assess and recommend ETL environment from technology strategy and operational standpoint
    • Understand and assess source system data issues and recommend solutions from a data integration standpoint
    • Create high level and low level technical design documents for data integration
    • Design exception handling, audit and data resolution processes
    • Performance tune ETL environment
    • Conduct proofs of concept
    • Estimate work based on functional requirements documents
    • Identify system deficiencies and recommend solutions
    • Design, code, and write unit test cases from functional requirements
    • Deliver efficient and bug-free ETL packages and documentation
    • Maintain and support enterprise ETL jobs
    • Experience with Talend Hadoop tools is a plus

    Basic Qualifications:
    • 3+ years of development experience on Talend ETL tools
    • 7+ years working with one or more of the following ETL Tools: Talend, Informatica, Ab Initio or Data Stage
    • 7+ years proficient experience as a developer
    • Bachelor's Degree in Computer Science or equivalent
    • Database (Oracle, SQL Server, DB2)
    • Database Programming (Complex SQL, PL/SQL development knowledge)
    • Data Modeling
    • Business Analysis
    • Top level performer with ability to work independently in short time frames
    • Proficient working in a Linux environment
    • Experience in scripting languages (Shell, Python or Perl)
    • 5+ years of experience deploying large scale projects ETL projects that
    • 3+ years of experience in a development lead position
    • Data analysis, data mapping, data loading, and data validation
    • Understand reusability, parameterization, workflow design, etc.
    • Thorough understanding of the entire software life cycle and various software engineering methodologies
    • Performance tuning of interfaces that extract, transform and load tens of millions of records
    • Knowledge of Hadoop ecosystem technologies is a plus

    Additional Information: If you are comfortable with the position and location, please reply at your earliest convenience with your updated resume and the following details, or call me back on my number. Full Name: Email: Skype id: Contact Nos.: Current Location: Open to relocate: Start Availability: Work Permit: Flexible time for INTERVIEW: Current Company: Current Rate: Expected Rate: Total IT Experience [Years]: Total US Experience [Years]: Key Skill Set: Best time to call: If you are not interested, I would be very grateful if you could pass this position to colleagues or friends who might be interested. All your information will be kept confidential according to EEO guidelines.
    $100k-125k yearly est. 60d+ ago

Learn more about data engineer jobs

What are the top employers for data engineer in NY?

Contact Government Services, LLC

Garner Health Technology, Inc.

Top 10 Data Engineer companies in NY

  1. Ernst & Young

  2. Point72

  3. Capital One

  4. Contact Government Services, LLC

  5. Capgemini

  6. Bloomberg

  7. Meta

  8. Garner Health Technology, Inc.

  9. Fanatics

  10. Genentech
