Data engineer jobs in Hayward, CA - 10,497 jobs

  • Delivery Consultant- GenAI/ML & Data Science, AWS Industries

    Amazon 4.7 company rating

    Data engineer job in Santa Clara, CA

    Application deadline: Jan 16, 2026 The Amazon Web Services Professional Services (ProServe) team is seeking a skilled Delivery Consultant to join our team at Amazon Web Services (AWS). In this role, you'll work closely with customers to design, implement, and manage AWS solutions that meet their technical requirements and business objectives. You'll be a key player in driving customer success through their cloud journey, providing technical expertise and best practices throughout the project lifecycle. Possessing a deep understanding of AWS products and services, as a Delivery Consultant you will be proficient in architecting complex, scalable, and secure solutions tailored to meet the specific needs of each customer. You'll work closely with stakeholders to gather requirements, assess current infrastructure, and propose effective migration strategies to AWS. As trusted advisors to our customers, providing guidance on industry trends, emerging technologies, and innovative solutions, you will be responsible for leading the implementation process, ensuring adherence to best practices, optimizing performance, and managing risks throughout the project. The AWS Professional Services organization is a global team of experts that help customers realize their desired business outcomes when using the AWS Cloud. We work together with customer teams and the AWS Partner Network (APN) to execute enterprise cloud computing initiatives. Our team provides assistance through a collection of offerings which help customers achieve specific outcomes related to enterprise cloud adoption. We also deliver focused guidance through our global specialty practices, which cover a variety of solutions, technologies, and industries. 
Key job responsibilities As an experienced technology professional, you will be responsible for: - Designing, implementing, and building complex, scalable, and secure GenAI and ML applications and models built on AWS tailored to customer needs - Providing technical guidance and implementation support throughout project delivery, with a focus on using AWS AI/ML services - Collaborating with customer stakeholders to gather requirements and propose effective model training, building, and deployment strategies - Acting as a trusted advisor to customers on industry trends and emerging technologies - Sharing knowledge within the organization through mentoring, training, and creating reusable artifacts About the team About AWS: Diverse Experiences: AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job below, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture - Here at AWS, it's in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth - We're continuously raising our performance bar as we strive to become Earth's Best Employer. 
That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance - We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud. Basic Qualifications - Experience developing software code in one or more programming languages (Java, Python, etc.) - PhD or Master of Science degree in Computer Science, or related technical, math, or scientific field (or equivalent experience) - 5+ years of cloud based solution (AWS or equivalent), system, network and operating system experience - 5+ years of experience hosting and deploying GenAI/ML solutions (e.g., for data pre-processing, training, deep learning, fine tuning, and inferences) and/or Data Science experience - 5+ years of coding, data querying languages (e.g. SQL), scripting languages (e.g. Python) Preferred Qualifications - Knowledge of AWS platform and tools or equivalent cloud experience. Ideally, the candidate has AWS Experience with a proficiency in a wide range of AWS services (e.g. SageMaker, Bedrock, EMR, S3, OpenSearch Service, Step Functions, Lambda, and EC2) - AWS Professional level certifications (e.g., Solutions Architect Professional, DevOps Engineer Professional, Machine Learning Specialty) preferred - Hands-on experience with deep learning (e.g., CNN, RNN, LSTM, Transformer), machine learning, CV, GNN, or distributed training - Experience with coding, automation and scripting (e.g., Terraform, Python) - Strong communication skills with the ability to explain technical concepts to both technical and non-technical audiences Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. 
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records. Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit ********************************************************* for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner. The base pay range for this position is listed below. Hourly pay ranges include the base pay rate plus the highest available shift differential which applies depending on the shift you select. Colorado $131,300 - $177,600 annually / hourly National $118,200 - $204,300 annually / hourly For salaried roles, your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. 
Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at ***************************************************** . For hourly roles, as a total compensation company, you are eligible for additional earnings including overtime pay and performance bonuses. Final pay will be based on factors including shift selection and location. Starting Day 1 of employment, Amazon offers EAP, Mental Health Support, Medical Advice Line, 401(k) matching. Learn more about our benefits at ********************************************* .
    $131.3k-177.6k yearly 5d ago
  • Machine Learning Engineer - Backend/Data Engineer: Agentic Workflows

    Apple Inc. 4.8 company rating

    Data engineer job in Sunnyvale, CA

    We design, build and maintain infrastructure to support agentic workflows for Siri. Our team is in charge of data generation, introspection and evaluation frameworks that are key to efficiently developing foundation models and agentic workflows for Siri applications. In this team you will have the opportunity to work at the intersection of cutting-edge foundation models and products. Minimum Qualifications Strong background in computer science: algorithms, data structures and system design 3+ years of experience in large-scale distributed system design, operation and optimization Experience with SQL/NoSQL database technologies, data warehouse frameworks like BigQuery/Snowflake/RedShift/Iceberg and data pipeline frameworks like GCP Dataflow/Apache Beam/Spark/Kafka Experience processing data for ML applications at scale Excellent interpersonal skills; able to work independently as well as cross-functionally Preferred Qualifications Experience fine-tuning and evaluating Large Language Models Experience with Vector Databases Experience deploying and serving LLMs At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $147,400 and $272,100, and your base pay will depend on your skills, qualifications, experience, and location. Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. 
You'll also receive benefits including: Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses - including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits. Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program. Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
    $147.4k-272.1k yearly 4d ago
  • Databricks Data Engineer - Manager - Consulting - Location Open

    Ernst & Young Oman 4.7 company rating

    Data engineer job in San Francisco, CA

    At EY, we're all in to shape your future with confidence. We'll help you succeed in a globally connected powerhouse of diverse teams and take your career wherever you want it to go. Join EY and help to build a better working world. Technology - Data and Decision Science - Data Engineering - Manager We are looking for a dynamic and experienced Manager of Data Engineering to lead our team in designing and implementing complex cloud analytics solutions with a strong focus on Databricks. The ideal candidate will possess deep technical expertise in data architecture, cloud technologies, and analytics, along with exceptional leadership and client management skills. The opportunity In this role, you will design and build analytics solutions that deliver significant business value. You will collaborate with other data and analytics professionals, management, and stakeholders to ensure that business requirements are translated into effective technical solutions. Key responsibilities include: Understanding and analyzing business requirements to translate them into technical requirements. Designing, building, and operating scalable data architecture and modeling solutions. Staying up to date with the latest trends and emerging technologies to maintain a competitive edge. Key Responsibilities As a Data Engineering Manager, you will play a crucial role in managing and delivering complex technical initiatives. Your time will be spent across various responsibilities, including: Leading workstream delivery and ensuring quality in all processes. Engaging with clients on a daily basis, actively participating in working sessions, and identifying opportunities for additional services. Implementing resource plans and budgets while managing engagement economics. This role offers the opportunity to work in a dynamic environment where you will face challenges that require innovative solutions. 
You will learn and grow as you guide others and interpret internal and external issues to recommend quality solutions. Travel may be required regularly based on client needs. Skills and attributes for success To thrive in this role, you should possess a blend of technical and interpersonal skills. The following attributes will make a significant impact: Lead the design and development of scalable data engineering solutions using Databricks on cloud platforms (e.g., AWS, Azure, GCP). Oversee the architecture of complex cloud analytics solutions, ensuring alignment with business objectives and best practices. Manage and mentor a team of data engineers, fostering a culture of innovation, collaboration, and continuous improvement. Collaborate with clients to understand their analytics needs and deliver tailored solutions that drive business value. Ensure the quality, integrity, and security of data throughout the data lifecycle, implementing best practices in data governance. Drive end-to-end data pipeline development, including data ingestion, transformation, and storage, leveraging Databricks and other cloud services. Communicate effectively with stakeholders, including technical and non-technical audiences, to convey complex data concepts and project progress. Manage client relationships and expectations, ensuring high levels of satisfaction and engagement. Stay abreast of the latest trends and technologies in data engineering, cloud computing, and analytics. Strong analytical and problem‑solving abilities. Excellent communication skills, with the ability to convey complex information clearly. Proven experience in managing and delivering projects effectively. Ability to build and manage relationships with clients and stakeholders. To qualify for the role, you must have Bachelor's degree in computer science, Engineering, or a related field required; Master's degree preferred. 
Typically, no less than 4‑6 years relevant experience in data engineering, with a focus on cloud data solutions and analytics. Proven expertise in Databricks and experience with Spark for big data processing. Strong background in data architecture and design, with experience in building complex cloud analytics solutions. Experience in leading and managing teams, with a focus on mentoring and developing talent. Strong programming skills in languages such as Python, Scala, or SQL. Excellent problem‑solving skills and the ability to work independently and as part of a team. Strong communication and interpersonal skills, with a focus on client management. Required Expertise for Managerial Role Strategic Leadership: Ability to align data engineering initiatives with organizational goals and drive strategic vision. Project Management: Experience in managing multiple projects and teams, ensuring timely delivery and adherence to project scope. Stakeholder Engagement: Proficiency in engaging with various stakeholders, including executives, to understand their needs and present solutions effectively. Change Management: Skills in guiding clients through change processes related to data transformation and technology adoption. Risk Management: Ability to identify potential risks in data projects and develop mitigation strategies. Technical Leadership: Experience in leading technical discussions and making architectural decisions that impact project outcomes. Documentation and Reporting: Proficiency in creating comprehensive documentation and reports to communicate project progress and outcomes to clients. Large-Scale Implementation Programs Enterprise Data Lake Implementation: Led the design and deployment of a cloud-based data lake solution for a Fortune 500 retail client, integrating data from multiple sources (e.g., ERPs, POS systems, e‑commerce platforms) to enable advanced analytics and reporting capabilities. 
Real‑Time Analytics Platform: Managed the development of a real‑time analytics platform using Databricks for a financial services organization, enabling real‑time fraud detection and risk assessment through streaming data ingestion and processing. Data Warehouse Modernization: Oversaw the modernization of a legacy data warehouse to a cloud‑native architecture for a healthcare provider, implementing ETL processes with Databricks and improving data accessibility for analytics and reporting. Ideally, you'll also have Experience with advanced data analytics tools and techniques. Familiarity with machine learning concepts and applications. Knowledge of industry trends and best practices in data engineering. Familiarity with cloud platforms (AWS, Azure, GCP) and their data services. Knowledge of data governance and compliance standards. Experience with machine learning frameworks and tools. What we look for We seek individuals who are not only technically proficient but also possess the qualities of top performers, including a strong sense of collaboration, adaptability, and a passion for continuous learning. If you are driven by results and have a desire to make a meaningful impact, we want to hear from you. What we offer you At EY, we'll develop you with future‑focused skills and equip you with world‑class experiences. We'll empower you in a flexible environment, and fuel you and your extraordinary talents in a diverse and inclusive culture of globally connected teams. Learn more. We offer a comprehensive compensation and benefits package where you'll be rewarded based on your performance and recognized for the value you bring to the business. The base salary range for this job in all geographic locations in the US is $125,500 to $230,200. The base salary range for New York City Metro Area, Washington State and California (excluding Sacramento) is $150,700 to $261,600. 
Individual salaries within those ranges are determined through a wide variety of factors including but not limited to education, experience, knowledge, skills and geography. In addition, our Total Rewards package includes medical and dental coverage, pension and 401(k) plans, and a wide range of paid time off options. Join us in our team‑led and leader‑enabled hybrid model. Our expectation is for most people in external, client serving roles to work together in person 40‑60% of the time over the course of an engagement, project or year. Under our flexible vacation policy, you'll decide how much vacation time you need based on your own personal circumstances. You'll also be granted time off for designated EY Paid Holidays, Winter/Summer breaks, Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well‑being. Are you ready to shape your future with confidence? Apply today. EY accepts applications for this position on an on‑going basis. For those living in California, please click here for additional information. EY focuses on high‑ethical standards and integrity among its employees and expects all candidates to demonstrate these qualities. EY | Building a better working world EY is building a better working world by creating new value for clients, people, society and the planet, while building trust in capital markets. Enabled by data, AI and advanced technology, EY teams help clients shape the future with confidence and develop answers for the most pressing issues of today and tomorrow. EY teams work across a full spectrum of services in assurance, consulting, tax, strategy and transactions. Fueled by sector insights, a globally connected, multi‑disciplinary network and diverse ecosystem partners, EY teams can provide services in more than 150 countries and territories. 
EY provides equal employment opportunities to applicants and employees without regard to race, color, religion, age, sex, sexual orientation, gender identity/expression, pregnancy, genetic information, national origin, protected veteran status, disability status, or any other legally protected basis, including arrest and conviction records, in accordance with applicable law. EY is committed to providing reasonable accommodation to qualified individuals with disabilities including veterans with disabilities. If you have a disability and either need assistance applying online or need to request an accommodation during any part of the application process, please call 1‑800‑EY‑HELP3, select Option 2 for candidate related inquiries, then select Option 1 for candidate queries and finally select Option 2 for candidates with an inquiry which will route you to EY's Talent Shared Services Team (TSS) or email the TSS at **************************.
    $150.7k-261.6k yearly 4d ago
  • Full-Stack Engineer: AI Data Editor

    Hex 3.9 company rating

    Data engineer job in San Francisco, CA

    A cutting-edge data analytics firm in San Francisco is seeking a full-stack engineer to enhance user experiences and integrate AI tools within their platform. You will work on innovative projects that shape data interactions, collaborate with teams on product initiatives, and tackle UX challenges. Ideal candidates should possess 3+ years of software engineering experience, proficiency in React and TypeScript, and a strong desire to work in AI development. This position offers a competitive salary and benefits, with a hybrid work model.
    $126k-178k yearly est. 21h ago
  • Staff Data Scientist - Post Sales

    Harnham

    Data engineer job in Fremont, CA

    Salary: $200-250k base + RSUs This fast-growing Series E AI SaaS company is redefining how modern engineering teams build and deploy applications. We're expanding our data science organization to accelerate customer success after the initial sale: driving onboarding, retention, expansion, and long-term revenue growth. About the Role As the senior data scientist supporting post-sales teams, you will use advanced analytics, experimentation, and predictive modeling to guide strategy across Customer Success, Account Management, and Renewals. Your insights will help leadership forecast expansion, reduce churn, and identify the levers that unlock sustainable net revenue retention. Key Responsibilities Forecast & Model Growth: Build predictive models for renewal likelihood, expansion potential, churn risk, and customer health scoring. Optimize the Customer Journey: Analyze onboarding flows, product adoption patterns, and usage signals to improve activation, engagement, and time-to-value. Experimentation & Causal Analysis: Design and evaluate experiments (A/B tests, uplift modeling) to measure the impact of onboarding programs, success initiatives, and pricing changes on retention and expansion. Revenue Insights: Partner with Customer Success and Sales to identify high-value accounts, cross-sell opportunities, and early warning signs of churn. Cross-Functional Partnership: Collaborate with Product, RevOps, Finance, and Marketing to align post-sales strategies with company growth goals. Data Infrastructure Collaboration: Work with Analytics Engineering to define data requirements, maintain data quality, and enable self-serve dashboards for Success and Finance teams. Executive Storytelling: Present clear, actionable recommendations to senior leadership that translate complex analysis into strategic decisions. About You Experience: 6+ years in data science or advanced analytics, with a focus on post-sales, customer success, or retention analytics in a B2B SaaS environment. 
Technical Skills: Expert SQL and proficiency in Python or R for statistical modeling, forecasting, and machine learning. Domain Knowledge: Deep understanding of SaaS metrics such as net revenue retention (NRR), gross churn, expansion ARR, and customer health scoring. Analytical Rigor: Strong background in experimentation design, causal inference, and predictive modeling to inform customer-lifecycle strategy. Communication: Exceptional ability to translate data into compelling narratives for executives and cross-functional stakeholders. Business Impact: Demonstrated success improving onboarding efficiency, retention rates, or expansion revenue through data-driven initiatives.
    $200k-250k yearly 2d ago
  • Data Partnerships Lead - Equity & Growth (SF)

    Exa

    Data engineer job in San Francisco, CA

    A cutting-edge AI search engine company in San Francisco is seeking a Data Partnerships specialist to build their data pipeline. The role involves owning the partnerships cycle, making strategic decisions, negotiating contracts, and potentially building a team. Candidates should have experience in contract negotiation and a Juris Doctor degree. This in-person role offers a competitive salary range of $160,000 - $250,000 with above-market equity.
    $160k-250k yearly 21h ago
  • Senior Energy Data Engineer - API & Spark Pipelines

    Medium 4.0 company rating

    Data engineer job in San Francisco, CA

    A technology finance firm in San Francisco is seeking an experienced Data Engineer. The role involves building data pipelines, integrating data across various platforms, and developing scalable web applications. The ideal candidate will have a strong background in data analysis, software development, and experience with AWS. The salary range for this position is between $160,000 and $210,000, with potential bonuses and equity.
    $160k-210k yearly 3d ago
  • Senior Data Engineer: ML Pipelines & Signal Processing

    Zendar

    Data engineer job in Berkeley, CA

    An innovative tech firm in Berkeley seeks a Senior Data Engineer to manage complex data engineering pipelines. You will ensure data quality, support ML engineers across locations, and establish infrastructure standards. The ideal candidate has over 5 years of experience in Data Science or MLOps, strong algorithmic skills, and proficiency in GCP, Python, and SQL. This role offers competitive salary and the chance to impact a growing team in a dynamic field.
    $110k-157k yearly est. 3d ago
  • Founding ML Infra Engineer - Audio Data Platform

    David Ai

    Data engineer job in San Francisco, CA

    A pioneering audio tech company based in San Francisco is searching for a Founding Machine Learning Infrastructure Engineer. In this role, you will build and scale the core infrastructure that powers cutting-edge audio ML products. You will lead the development of systems for training and deploying models. Candidates should have over 5 years of backend experience with strong skills in cloud infrastructure and machine learning principles. The company offers benefits like unlimited PTO and comprehensive health coverage.
    $110k-157k yearly est. 3d ago
  • Data/Full Stack Engineer, Data Storage & Ingestion Consultant

    Eon Systems PBC

    Data engineer job in San Francisco, CA

    About us
    At Eon, we are at the forefront of large-scale neuroscientific data collection. Our mission is to enable the safe and scalable development of brain emulation technology to empower humanity over the next decade, beginning with the creation of a fully emulated digital twin of a mouse.

    Role
    We're a San Francisco team collecting very large microscopy datasets, and we need an expert to design and implement our end-to-end data pipeline, from high-rate ingest to multi-petabyte storage and downstream processing. You'll own the strategy (on-prem vs. S3 or hybrid), the bill of materials, and the deployment, and you'll be on the floor wiring, racking, tuning, and validating performance. Our current instruments generate data at ~1+ GB/s sustained (higher during bursts), and the program will accumulate multiple petabytes in total over time. You'll help us choose and implement the right architecture, considering reliability and cost controls.

    Outcomes (what success looks like)
    - Within 2 weeks: Implement an immediate data-handling strategy that reliably ingests our initial data streams.
    - Within 2 weeks: Deliver a documented medium-term data architecture covering storage, networking, ingest, and durability.
    - Within 1 month: Operationalize the medium-term pipeline in production (ingest → buffer → long-term store → compute access).
    - Ongoing: Maintain ≥95% uptime for the end-to-end data-handling pipeline after setup.

    Responsibilities
    - Architect ingest & storage: Choose and implement an on-prem hardware and data pipeline design or a cloud/S3 alternative with explicit cost and performance tradeoffs at multi-petabyte scale.
    - Set up a sustained-write ingest path ≥1 GB/s with adequate burst headroom (camera/frame-to-disk), including networking considerations, cooling, and throttling safeguards.
    - Optimize footprint & cost: Incorporate on-the-fly compression/downsampling options and quantify CPU budget vs. write-speed tradeoffs; document when/where to compress to control $/PB.
    - Integrate with acquisition workflows, ensuring image data and metadata are compatible with downstream stitching/flat-field correction pipelines.
    - Enable downstream compute: Expose the data to segmentation/analysis stacks (local GPU nodes or cloud).

    Skills
    - 5+ years designing and deploying high-throughput storage or HPC pipelines (≥1 GB/s sustained ingest) in production.
    - Deep hands-on experience with NVMe RAID/striping, ZFS/MDRAID/erasure coding, PCIe topology, NUMA pinning, Linux performance tuning, and NIC offload features.
    - Proven delivery of multi-GB/s ingest systems and petabyte-scale storage in production (life-sciences, vision, HPC, or media).
    - Experience building tiered storage systems (NVMe → HDD/object) and validating real-world throughput under sustained load.
    - Practical S3/object-storage know-how (AWS S3 and/or on-prem S3-compatible systems) with lifecycle, versioning, and cost controls.
    - Data integrity & reliability: snapshots, scrubs, replication, erasure coding, and backup/DR for PB-scale systems.
    - Networking: ****25/40/100 GbE (SFP+/SFP28), RDMA/RoCE/iWARP familiarity; switch config and path tuning.
    - Ability to spec and rack hardware: selecting chassis/backplanes, RAID/HBA cards, NICs, and cooling strategies to prevent NVMe throttling under sustained writes.

    Ideal skills
    - Experience with microscopy or scientific imaging ingest at frame-to-disk speeds, including Micro-Manager-based pipelines and raw-to-containerized format conversions.
    - Experience with life science imaging data is a plus.

    Engagement details
    - Contract (1099 or corp-to-corp); contract-to-hire if there's a mutual fit.
    - On-site requirement: You must be physically present in San Francisco during build-out and initial operations; local field work (e.g., UCSF) as needed.
    - Compensation: Contract, $100-300/hour
    - Timeline: Immediate start
    $110k-157k yearly est. 4d ago
  • Global Data ML Engineer for Multilingual Speech & AI

    Cartesia

    Data engineer job in San Francisco, CA

    A leading technology company in San Francisco is seeking a Machine Learning Engineer to ensure the quality and coverage of data across diverse languages. You will design large-scale datasets, evaluate models, and implement quality control systems. The ideal candidate has expertise in multilingual datasets and a strong background in applied ML. This full-time role offers competitive benefits, including fully covered insurance and in-office perks, in a supportive team environment.
    $110k-157k yearly est. 21h ago
  • ML Data Engineer: Systems & Retrieval for LLMs

    Zyphra Technologies Inc.

    Data engineer job in Palo Alto, CA

    A leading AI technology company based in Palo Alto, CA is seeking a Machine Learning Data Engineer. You will build and optimize the data infrastructure for our machine learning systems while collaborating with ML engineers and infrastructure teams. The ideal candidate has a strong engineering background in Python, experience in production data pipelines, and a deep understanding of distributed systems. This role offers comprehensive benefits, a collaborative environment, and opportunities for innovative contributions.
    $110k-157k yearly est. 2d ago
  • Assoc Director, Data Scientist

    Gilead Sciences, Inc. 4.5company rating

    Data engineer job in Foster City, CA

    At Gilead, we're creating a healthier world for all people. For more than 35 years, we've tackled diseases such as HIV, viral hepatitis, COVID-19 and cancer - working relentlessly to develop therapies that help improve lives and to ensure access to these therapies across the globe. We continue to fight against the world's biggest health challenges, and our mission requires collaboration, determination and a relentless drive to make a difference. Every member of Gilead's team plays a critical role in the discovery and development of life-changing scientific innovations. Our employees are our greatest asset as we work to achieve our bold ambitions, and we're looking for the next wave of passionate and ambitious people ready to make a direct impact. We believe every employee deserves a great leader. People Leaders are the cornerstone to the employee experience at Gilead and Kite. As a people leader now or in the future, you are the key driver in evolving our culture and creating an environment where every employee feels included, developed and empowered to fulfil their aspirations. Join Gilead and help create possible, together. Gilead's AI Research Center(ARC) is looking for a Principal Data Scientist to spearhead the adoption of AI/ML and transform our clinical development processes. This is a pivotal role where you will provide key thought leadership and drive our strategic vision for advanced analytics, with the goal of optimizing clinical trials, enhancing data-driven decision-making, and providing support for Real-World Evidence (RWE), Clinical Pharmacology, and Biomarkers initiatives. You will be a thought leader in applying AI/ML to real-world clinical challenges, taking deep involvement in all stages of technical development-from coding and configuring compute environments to model evaluation, review, and architecture design. 
You'll work closely with a variety of cross-functional teams, including architects, data engineers, and product managers, to scope, develop, and operationalize our AI-driven applications, with a specific focus on leveraging AI/ML to advance insights within RWE, Clinical Pharmacology, and Biomarkers. Responsibilities: Innovate and Strategize: Spearhead the strategic vision for leveraging AI/ML within clinical development. You'll partner with cross-functional leaders to identify high-impact opportunities and design innovative solutions that transform how we conduct trials and make data-driven decisions. Lead with Expertise: Guide the full lifecycle of machine learning models from initial concept to real-world application. This includes architecting scalable solutions, hands-on algorithm development, and ensuring models are rigorously evaluated and operationalized for use in RWE, Clinical Pharmacology, and Biomarkers. Mentor and Empower: Act as a force multiplier for our data science team. You'll coach and mentor senior and junior data scientists, fostering a culture of technical excellence and continuous learning. Translate and Execute: Serve as a bridge between technical teams and business stakeholders. You'll translate complex business challenges into precise data science problems and, in a product manager-like role, drive the development of these solutions from proof-of-concept to production. Drive Breakthroughs: Research and develop cutting-edge algorithms to solve critical challenges. This could involve using NLP for patient insights, computer vision for biomarker analysis, or predictive models to optimize trial logistics. You'll be at the forefront of applying these techniques in a biotech context. Build the Foundation: Design and implement the technical and process building blocks needed to scale our AI/ML capabilities. This includes working with IT partners to curate and operationalize the datasets essential for fueling our analytical pipelines. 
Influence and Advise: Interface directly with internal stakeholders, acting as a trusted advisor to help them understand the potential of advanced analytics and apply data-driven approaches to optimize clinical trial operations. Stay Ahead: Continuously monitor the landscape of machine learning and biopharmaceutical innovation. You'll ensure our team is leveraging the latest state-of-the-art techniques to maintain a competitive edge. Technical Skills: Advanced Model Development & Operationalization: Deep expertise in developing, deploying, and managing complex machine learning and deep learning algorithms at scale. This includes a profound understanding of model evaluation, scoring methodologies, and mitigation of model bias to ensure robust, ethical, and reliable outcomes. Data & Computational Proficiency: Fluent in Python or R and SQL, with hands-on experience in building and optimizing data pipelines for analytical and model development purposes. Cloud-Native AI/ML: Demonstrated experience with Cloud DevOps on AWS as it pertains to the entire data science lifecycle, from data ingestion to model serving and monitoring. Translational Research: Proven ability to translate foundational AI/ML research into functional, production-ready packages and applications that directly support strategic initiatives in areas like RWE, Clinical Pharmacology, and Biomarkers. Basic Qualifications: Doctorate and 5+ years of relevant experience OR Master's and 8+ years of relevant experience OR Bachelor's and 10+ years of relevant experience Preferred Qualifications: Ability to translate stakeholder needs into clear technical requirements, including those related to RWE, Clinical Pharmacology, and Biomarkers. Skill in scoping project requirements and developing timelines. Knowledge of product management principles. Experience with code management using Git. Strong technical documentation skills. 
Join us at the AI Research Center to shape the future of clinical development with groundbreaking AI/ML solutions, and contribute to advancements in RWE, Clinical Pharmacology, and Biomarkers! The salary range for this position is: Bay Area: $210,375.00 - $272,250.00. Other US Locations: $191,250.00 - $247,500.00. Gilead considers a variety of factors when determining base compensation, including experience, qualifications, and geographic location. These considerations mean actual compensation will vary. This position may also be eligible for a discretionary annual bonus, discretionary stock-based long-term incentives (eligibility may vary based on role), paid time off, and a benefits package. Benefits include company-sponsored medical, dental, vision, and life insurance plans*. For additional benefits information, visit: ****************************************************************** * Eligible employees may participate in benefit plans, subject to the terms and conditions of the applicable plans. For jobs in the United States: Gilead Sciences Inc. is committed to providing equal employment opportunities to all employees and applicants for employment, and is dedicated to fostering an inclusive work environment comprised of diverse perspectives, backgrounds, and experiences. Employment decisions regarding recruitment and selection will be made without discrimination based on race, color, religion, national origin, sex, age, sexual orientation, physical or mental disability, genetic information or characteristic, gender identity and expression, veteran status, or other non-job related characteristics or other prohibited grounds specified in applicable federal, state and local laws. 
In order to ensure reasonable accommodation for individuals protected by Section 503 of the Rehabilitation Act of 1973, the Vietnam Era Veterans' Readjustment Act of 1974, and Title I of the Americans with Disabilities Act of 1990, applicants who require accommodation in the job application process may contact ApplicantAccommodations@gilead.com for assistance. For more information about equal employment opportunity protections, please view the 'Know Your Rights' poster. Our environment respects individual differences and recognizes each employee as an integral member of our company. Our workforce reflects these values and celebrates the individuals who make up our growing team. Gilead provides a work environment free of harassment and prohibited conduct. We promote and support individual differences and diversity of thoughts and opinion. For Current Gilead Employees and Contractors: Please apply via the Internal Career Opportunities portal in Workday. Job Requisition ID: R0046852. Full Time/Part Time: Full-Time. Job Level: Associate Director.
    $210.4k-272.3k yearly 4d ago
  • Distinguished Data Engineer - Card Data

    Capital One 4.7company rating

    Data engineer job in San Francisco, CA

    Distinguished Data Engineers are individual contributors who strive to be diverse in thought so we visualize the problem space. At Capital One, we believe diversity of thought strengthens our ability to influence, collaborate and provide the most innovative solutions across organizational boundaries. Distinguished Engineers will significantly impact our trajectory and devise clear roadmaps to deliver next generation technology solutions. About the Team: Capital One is seeking a Distinguished Data Engineer to work in our Credit Card Technology Data Engineering Team and build the future of financial services. We are a fast-paced, mission-driven group responsible for managing and leveraging petabytes of sensitive, real-time and batch data that powers everything from fraud detection models and personalized reward systems to regulatory compliance reporting. As a leader in Data Engineering, you won't just move data; you'll architect high-availability systems that directly influence millions of customer experiences and secure billions in transactions daily. You'll own critical data domains end-to-end, working cross-functionally with ML Scientists, Product Managers, Business Analysts, and other teams to solve complex, high-stakes problems with cutting-edge cloud technologies (like Snowflake, Kafka, and AWS). If you thrive on technical challenges, demand data integrity, and want your work to have a clear, measurable impact on the bank's core profitability and security, this is your team. This leader must have the ability to attract and recruit the industry's best talent, and simultaneously have the technical chops to ensure that we build compelling, customer-oriented solutions in an iterative methodology. 
Success in the role requires an innovative mind, a proven track record of delivering next generation software and data products, rigorous analytical skills, and a passion for delivering customer value through automation, machine learning and predictive analytics. Our Distinguished Engineers Are: Deep technical experts and thought leaders that help accelerate adoption of the very best engineering practices, while maintaining knowledge on industry innovations, trends and practices; Visionaries, collaborating on Capital One's toughest issues, to deliver on business needs that directly impact the lives of our customers and associates; Role models and mentors, helping to coach and strengthen the technical expertise and know-how of our engineering and product community; Evangelists, both internally and externally, helping to elevate the Distinguished Engineering community and establish themselves as a go-to resource on given technologies and technology-enabled capabilities. Responsibilities: Build awareness, increase knowledge and drive adoption of modern technologies, sharing consumer and engineering benefits to gain buy-in; Strike the right balance between lending expertise and providing an inclusive environment where others' ideas can be heard and championed; leverage expertise to grow skills in the broader Capital One team; Promote a culture of engineering excellence, using opportunities to reuse and innersource solutions where possible; Effectively communicate with and influence key stakeholders across the enterprise, at all levels of the organization; Operate as a trusted advisor for a specific technology, platform or capability domain, helping to shape use cases and implementation in a unified manner; Lead the way in creating next-generation talent for Tech, mentoring internal talent and actively recruiting external talent to bolster Capital One's Tech talent. Basic Qualifications: Bachelor's Degree; At least 7 years of experience in data engineering; 
At least 3 years of experience in data architecture; At least 2 years of experience building applications in AWS. Preferred Qualifications: Master's Degree; 9+ years of experience in data engineering; 3+ years of data modeling experience; 2+ years of experience with ontology standards for defining a domain; 2+ years of experience using Python, SQL or Scala; 1+ year of experience deploying machine learning models; 3+ years of experience implementing big data processing solutions on AWS. Capital One will consider sponsoring a new qualified applicant for employment authorization for this position. Capital One offers a comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being. Eligibility varies based on full or part-time status, exempt or non-exempt status, and management level.
    $106k-144k yearly est. 2d ago
  • Staff Machine Learning Data Engineer

    Backflip 3.7company rating

    Data engineer job in San Francisco, CA

    Mechanical design, the work done in CAD, is the rate-limiter for progress in the physical world. However, there are only 2-4 million people on Earth who know how to CAD. But what if hundreds of millions could? What if creating something in the real world were as easy as imagining the use case, or sketching it on paper? Backflip is building a foundation model for mechanical design: unifying the world's scattered engineering knowledge into an intelligent, end-to-end design environment. Our goal is to enable anyone to imagine a solution and hit “print.” Founded by a second-time CEO in the same space (first company: Markforged), Backflip combines deep industry insight with breakthrough AI research. Backed by a16z and NEA, we raised a $30M Series A and built a deeply technical, mission-driven team. We're building the AI foundation that tomorrow's space elevators, nanobots, and spaceships will be built in. If you're excited to define the next generation of hard tech, come build it with us. The Role We're looking for a Staff Machine Learning Data Engineer to lead and build the data pipelines powering Backflip's foundation model for manufacturing and CAD. You'll design the systems, tools, and strategies that turn the world's engineering knowledge - text, geometry, and design intent - into high-quality training data. This is a core leadership role within the AI team, driving the data architecture, augmentation, and evaluation that underpin our model's performance and evolution. You'll collaborate with Machine Learning Engineers to run data-driven experiments, analyze results, and deliver AI products that shape the future of the physical world. What You'll Do Architect and own Backflip's ML data pipeline, from ingestion to processing to evaluation. Define data strategy: establish best practices for data augmentation, filtering, and sampling at scale. Design scalable data systems for multimodal training (text, geometry, CAD, and more). 
Develop and automate data collection, curation, and validation workflows. Collaborate with MLEs to design and execute experiments that measure and improve model performance. Build tools and metrics for dataset analysis, monitoring, and quality assurance. Contribute to model development through insights grounded in data, shaping what, how, and when we train. Who You Are You've built and maintained ML data pipelines at scale, ideally for foundation or generative models, that shipped into production in the real world. You have deep experience with data engineering for ML, including distributed systems, data extraction, transformation, and loading, and large-scale data processing (e.g. PySpark, Beam, Ray, or similar). You're fluent in Python and experienced with ML frameworks and data formats (Parquet, TFRecord, HuggingFace datasets, etc.). You've developed data augmentation, sampling, or curation strategies that improved model performance. You think like both an engineer and an experimentalist: curious, analytical, and grounded in evidence. You collaborate well across AI development, infra, and product, and enjoy building the data systems that make great models possible. You care deeply about data quality, reproducibility, and scalability. You're excited to help shape the future of AI for physical design. Bonus points if: You are comfortable working with a variety of complex data formats, e.g. for 3D geometry kernels or rendering engines. You have an interest in math, geometry, topology, rendering, or computational geometry. You've worked in 3D printing, CAD, or computer graphics domains. Why Backflip This is a rare opportunity to own the data backbone of a frontier foundation model, and help define how AI learns to design the physical world. You'll join a world-class, mission-driven team operating at the intersection of research, engineering, and deep product sense, building systems that let people design the physical world as easily as they imagine it. 
Your work will directly shape the performance, capability, and impact of Backflip's foundation model, the core of how the world will build in the future. Let's build the tools the future will be made in.
    $126k-178k yearly est. 3d ago
  • ML Engineer: Fraud Detection & Big Data at Scale

    Datavisor 4.5company rating

    Data engineer job in Mountain View, CA

    A leading security technology firm in California is seeking a skilled Data Science Engineer. You will harness the power of unsupervised machine learning to detect fraudulent activities across various sectors. Ideal candidates have experience with Java/C++, data structures, and machine learning. The company offers competitive pay, flexible schedules, equity participation, health benefits, a collaborative environment, and unique perks such as catered lunches and game nights.
    $125k-177k yearly est. 4d ago
  • Foundry Data Engineer: ETL Automation & Dashboards

    Data Freelance Hub 4.5company rating

    Data engineer job in San Francisco, CA

    A data consulting firm based in San Francisco is seeking a Palantir Foundry Consultant for a contract position. The ideal candidate should have strong experience in Palantir Foundry, SQL, and PySpark, with proven skills in data pipeline development and ETL automation. Responsibilities include building data pipelines, implementing interactive dashboards, and leveraging data analysis for actionable insights. This on-site role offers an excellent opportunity for those experienced in the field.
    $114k-160k yearly est. 2d ago
  • Multi-Channel Demand Gen Leader - Data SaaS

    Motherduck Corporation

    Data engineer job in San Francisco, CA

    A growing technology firm based in San Francisco is seeking a Demand Generation Marketer to drive campaigns that turn prospects into lifelong customers. This role emphasizes creativity in marketing, collaboration with teams, and a strong data-driven mindset. The ideal candidate will have experience in B2B SaaS environments and a passion for engaging technical audiences. Flexible work environment and competitive compensation offered.
    $112k-157k yearly est. 4d ago
  • Staff Data Scientist - Sales Analytics

    Harnham

    Data engineer job in San Francisco, CA

    Salary: $200-250k base + RSUs This fast-growing Series E AI SaaS company is redefining how modern engineering teams build and deploy applications. We're looking for a Staff Data Scientist to drive Sales and Go-to-Market (GTM) analytics, applying advanced modeling and experimentation to accelerate revenue growth and optimize the full sales funnel. About the Role As the senior data scientist supporting Sales and GTM, you will combine statistical modeling, experimentation, and advanced analytics to inform strategy and guide decision-making across our revenue organization. Your work will help leadership understand pipeline health, predict outcomes, and identify the levers that unlock sustainable growth. Key Responsibilities Model the Business: Build forecasting and propensity models for pipeline generation, conversion rates, and revenue projections. Optimize the Sales Funnel: Analyze lead scoring, opportunity progression, and deal velocity to recommend improvements in acquisition, qualification, and close rates. Experimentation & Causal Analysis: Design and evaluate experiments (A/B tests, uplift modeling) to measure the impact of pricing, incentives, and campaign initiatives. Advanced Analytics for GTM: Apply machine learning and statistical techniques to segment accounts, predict churn/expansion, and identify high-value prospects. Cross-Functional Partnership: Work closely with Sales, Marketing, RevOps, and Product to influence GTM strategy and ensure data-driven decisions. Data Infrastructure Collaboration: Partner with Analytics Engineering to define data requirements, ensure data quality, and enable self-serve reporting. Strategic Insights: Present findings to executive leadership, translating complex analyses into actionable recommendations. About You Experience: 6+ years in data science or advanced analytics roles, with significant time spent in B2B SaaS or developer tools environments. 
Technical Depth: Expert in SQL and proficient in Python or R for statistical modeling, forecasting, and machine learning. Domain Knowledge: Strong understanding of sales analytics, revenue operations, and product-led growth (PLG) motions. Analytical Rigor: Skilled in experimentation design, causal inference, and building predictive models that influence GTM strategy. Communication: Exceptional ability to tell a clear story with data and influence senior stakeholders across technical and business teams. Business Impact: Proven record of driving measurable improvements in pipeline efficiency, conversion rates, or revenue outcomes.
    $200k-250k yearly 2d ago
  • Machine Learning Data Engineer - Systems & Retrieval

    Zyphra Technologies Inc.

    Data engineer job in Palo Alto, CA

    Zyphra is an artificial intelligence company based in Palo Alto, California.

    The Role: As a Machine Learning Data Engineer - Systems & Retrieval, you will build and optimize the data infrastructure that fuels our machine learning systems. This includes designing high-performance pipelines for collecting, transforming, indexing, and serving massive, heterogeneous datasets, from raw web-scale data to enterprise document corpora. You'll play a central role in architecting retrieval systems for LLMs and enabling scalable training and inference with clean, accessible, and secure data. You'll have an impact across both research and product teams by shaping the foundation upon which intelligent systems are trained, retrieved, and reasoned over.

    You'll work across:
    - Design and implementation of distributed data ingestion and transformation pipelines
    - Building retrieval and indexing systems that support RAG and other LLM-based methods
    - Mining and organizing large unstructured datasets, both in research and production environments
    - Collaborating with ML engineers, systems engineers, and DevOps to scale pipelines and observability
    - Ensuring compliance and access control in data handling, with security and auditability in mind

    Requirements:
    - Strong software engineering background with fluency in Python
    - Experience designing, building, and maintaining data pipelines in production environments
    - Deep understanding of data structures, storage formats, and distributed data systems
    - Familiarity with indexing and retrieval techniques for large-scale document corpora
    - Understanding of database systems (SQL and NoSQL), their internals, and performance characteristics
    - Strong attention to security, access controls, and compliance best practices (e.g., GDPR, SOC2)
    - Excellent debugging, observability, and logging practices to support reliability at scale
    - Strong communication skills and experience collaborating across ML, infra, and product teams

    Bonus Skill Set:
    - Experience building or maintaining LLM-integrated retrieval systems (e.g., RAG pipelines)
    - Academic or industry background in data mining, search, recommendation systems, or IR literature
    - Experience with large-scale ETL systems and tools like Apache Beam, Spark, or similar
    - Familiarity with vector databases (e.g., FAISS, Weaviate, Pinecone) and embedding-based retrieval
    - Understanding of data validation and quality assurance in machine learning workflows
    - Experience working on cross-functional infra and MLOps teams
    - Knowledge of how data infrastructure supports training pipelines, inference serving, and feedback loops
    - Comfort working across raw, unstructured data, structured databases, and model-ready formats

    Why Work at Zyphra:
    - Our research methodology is to take grounded, methodical steps toward ambitious goals. Deep research and engineering excellence are equally valued
    - We strongly value new and crazy ideas and are very willing to bet big on them
    - We move as quickly as we can; we aim to keep the bar to impact as low as possible
    - We all enjoy what we do and love discussing AI

    Benefits and Perks:
    - Comprehensive medical, dental, vision, and FSA plans
    - Competitive compensation and 401(k)
    - Relocation and immigration support on a case-by-case basis
    - On-site meals prepared by a dedicated culinary team; Thursday Happy Hours
    - In-person team in Palo Alto, CA, with a collaborative, high-energy environment

    If you're excited by the challenge of high-scale, high-performance data engineering in the context of cutting-edge AI, you'll thrive in this role. Apply Today!
    $110k-157k yearly est. 2d ago

Learn more about data engineer jobs

How much does a data engineer earn in Hayward, CA?

The average data engineer in Hayward, CA earns between $94,000 and $184,000 annually. This compares to the national average data engineer range of $80,000 to $149,000.

Average data engineer salary in Hayward, CA

$131,000

What are the biggest employers of Data Engineers in Hayward, CA?

The biggest employers of Data Engineers in Hayward, CA are:
  1. Veev
  2. Ndimensions It