
Data engineer jobs in Sunnyvale, CA

- 9,404 jobs
  • Staff Data Scientist

    Quantix Search

    Data engineer job in Fremont, CA

    Staff Data Scientist | San Francisco | $250K-$300K + Equity

    We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI. In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.

    What you'll do:
    - Drive deep-dive analyses on user behavior, product performance, and growth drivers
    - Design and interpret A/B tests to measure product impact at scale
    - Build scalable data models, pipelines, and dashboards for company-wide use
    - Partner with Product and Engineering to embed experimentation best practices
    - Evaluate ML models, ensuring business relevance, performance, and trade-off clarity

    What we're looking for:
    - 5+ years in data science or product analytics at scale (consumer or marketplace preferred)
    - Advanced SQL and Python skills, with strong foundations in statistics and experimental design
    - Proven record of designing, running, and analyzing large-scale experiments
    - Ability to analyze and reason about ML models (classification, recommendation, LLMs)
    - Strong communicator with a track record of influencing cross-functional teams

    If you're excited by the sound of this challenge, apply today and we'll be in touch.
    $250k-300k yearly 4d ago
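The "design and interpret A/B tests" responsibility above can be sketched with a minimal example: a two-sided two-proportion z-test comparing conversion rates between a control and a treatment group. The sample counts are made up for illustration, and this uses only the Python standard library.

```python
from math import erf, sqrt

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: 4.8% vs 5.6% conversion on 10k users per arm
z, p = two_proportion_ztest(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

In practice, large-scale experimentation platforms add sequential-testing corrections and variance reduction on top of this basic comparison.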
  • Lead Data Scientist - Computer Vision

    Straive

    Data engineer job in Santa Clara, CA

    Lead Data Scientist - Computer Vision/Image Processing

    About the Role:
    We are seeking a Lead Data Scientist to drive the strategy and execution of data science initiatives, with a particular focus on computer vision systems and image processing techniques. The ideal candidate has deep expertise in image processing techniques including filtering, binary morphology, perspective/affine transformation, and edge detection.

    Requirements:
    - Solid knowledge of computer vision programs and image processing techniques: filtering, binary morphology, perspective/affine transformation, edge detection
    - Strong understanding of machine learning: regression, supervised and unsupervised learning
    - Proficiency in Python and libraries such as OpenCV, NumPy, scikit-learn, TensorFlow/PyTorch
    - Familiarity with version control (Git) and collaborative development practices
    $107k-154k yearly est. 4d ago
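Two of the techniques this posting names, binary morphology and edge detection, can be sketched in plain NumPy (in practice OpenCV's `cv2.erode` and `cv2.Sobel` would be used; this hand-rolled version just shows what those operations compute, on a made-up test image).

```python
import numpy as np

def binary_erode(img, k=3):
    """k x k binary erosion: a pixel survives only if its whole neighborhood is 1."""
    pad = k // 2
    padded = np.pad(img, pad, constant_values=0)
    out = np.ones_like(img)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel kernels (valid region only)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            patch = img[y:y + 3, x:x + 3]
            gx[y, x] = (patch * kx).sum()
            gy[y, x] = (patch * ky).sum()
    return np.hypot(gx, gy)

# 8x8 image with a 4x4 white square: erosion shrinks it, Sobel lights up its border
square = np.zeros((8, 8), dtype=int)
square[2:6, 2:6] = 1
eroded = binary_erode(square)
edges = sobel_magnitude(square.astype(float))
```

Erosion shrinks the 4x4 square to its 2x2 core, and the gradient magnitude is zero in the square's interior but nonzero along its border, which is exactly the behavior the named techniques are used for.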
  • Data Scientist V

    Creospan Inc.

    Data engineer job in Mountain View, CA

    Job Title: Data Scientist V - Data Analytics & Engineering
    Location: Onsite preferred (Mountain View, CA); remote considered for strong candidates (US time zones only)
    Duration: 12 months (possible extension)

    Required Skills:
    - Strong project or product management experience
    - Excellent communication and consulting skills
    - Proficiency in SQL and Python

    Nice to Have:
    - Experience with marketing analytics or campaigns
    - Experience in large tech or fast-paced startup environments
    - Familiarity with AI-driven workflows

    Why Join:
    - High-visibility, cross-functional role
    - Opportunity to work on advanced measurement and automation tools
    - Small, agile team with enterprise-scale impact
    $107k-155k yearly est. 2d ago
  • Data Scientist

    Centraprise

    Data engineer job in Pleasanton, CA

    Key Responsibilities:
    - Design and develop marketing-focused machine learning models, including customer segmentation; propensity, churn, and lifetime value (LTV) models; campaign response and uplift models; and attribution and marketing mix models (MMM)
    - Build and deploy NLP solutions for customer sentiment analysis; text classification and topic modeling; and social media, reviews, chat, and voice-of-customer analytics
    - Apply advanced statistical and ML techniques to solve real-world business problems
    - Work with structured and unstructured data from multiple marketing channels (digital, CRM, social, email, web)
    - Translate business objectives into analytical frameworks and actionable insights
    - Partner with stakeholders to define KPIs, success metrics, and experimentation strategies (A/B testing)
    - Optimize and productionize models using MLOps best practices
    - Mentor junior data scientists and provide technical leadership
    - Communicate complex findings clearly to technical and non-technical audiences

    Required Skills & Qualifications:
    - 7+ years of experience in data science, with a strong focus on marketing analytics
    - Strong expertise in machine learning (supervised and unsupervised techniques)
    - Hands-on experience with NLP techniques, including text preprocessing, feature extraction, and word embeddings (Word2Vec, GloVe, Transformers); Large Language Models (LLMs) experience is a plus
    - Proficiency in Python (NumPy, Pandas, scikit-learn, TensorFlow/PyTorch)
    - Experience with SQL and large-scale data processing
    - Strong understanding of statistics, probability, and experimental design
    - Experience working with cloud platforms (AWS, Azure, or GCP)
    - Ability to translate data insights into business impact

    Nice to Have:
    - Experience with marketing automation or CRM platforms
    - Knowledge of MLOps, model monitoring, and deployment pipelines
    - Familiarity with GenAI/LLM-based NLP use cases for marketing
    - Prior experience in consumer, e-commerce, or digital marketing domains

    EEO: Centraprise is an equal opportunity employer. Your application and candidacy will not be considered based on race, color, sex, religion, creed, sexual orientation, gender identity, national origin, disability, genetic information, pregnancy, veteran status, or any other characteristic protected by federal, state, or local laws.
    $107k-155k yearly est. 1d ago
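The lifetime value (LTV) modeling mentioned in this posting can be illustrated in its simplest form: the geometric-series LTV formula, LTV = m · r / (1 + d − r), where m is average monthly margin, r is the monthly retention rate, and d is a monthly discount rate. This is a toy sketch with made-up numbers, not any particular company's model.

```python
def simple_ltv(avg_monthly_margin, monthly_retention, monthly_discount=0.01):
    """Geometric-series customer lifetime value: m * r / (1 + d - r).

    Sums the discounted expected margin over an infinite horizon, assuming
    a constant month-over-month retention rate.
    """
    return (avg_monthly_margin * monthly_retention
            / (1 + monthly_discount - monthly_retention))

# Hypothetical segment: $20/month margin, 90% monthly retention
print(f"LTV = ${simple_ltv(20.0, 0.90):.2f}")  # about $163.64
```

Production LTV models replace the constant-retention assumption with per-cohort survival curves or probabilistic models, but the discounted-retention intuition is the same.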
  • Staff Data Engineer

    Strativ Group

    Data engineer job in Fremont, CA

    🌎 San Francisco (Hybrid) 💼 Founding/Staff Data Engineer 💵 $200-300k base

    Our client is an elite applied AI research and product lab building AI-native systems for finance, pushing frontier models into real production environments. Their work sits at the intersection of data, research, and high-stakes financial decision-making. As the Founding Data Engineer, you will own the data platform that powers everything: models, experiments, and user-facing products relied on by demanding financial customers. You'll make foundational architectural decisions, work directly with researchers and product engineers, and help define how data is built, trusted, and scaled from day one.

    What you'll do:
    - Design and build the core data platform, ingesting, transforming, and serving large-scale financial and alternative datasets
    - Partner closely with researchers and ML engineers to ship production-grade data and feature pipelines that power cutting-edge models
    - Establish data quality, observability, lineage, and reproducibility across both experimentation and production workloads
    - Deploy and operate data services using Docker and Kubernetes in a modern cloud environment (AWS, GCP, or Azure)
    - Make foundational choices on tooling, architecture, and best practices that will define how data works across the company
    - Continuously simplify and evolve systems, rewriting pipelines or infrastructure when it's the right long-term decision

    Ideal candidate:
    - Has owned or built high-performance data systems end-to-end, directly supporting production applications and ML models
    - Is strongest in backend and data infrastructure, with enough frontend literacy to integrate cleanly with web products when needed
    - Can design and evolve backend services and pipelines (Node.js or Python) to support new product features and research workflows
    - Is an expert in at least one statically typed language, with a strong bias toward type safety, correctness, and maintainable systems
    - Has deployed data workloads and services using Docker and Kubernetes on a major cloud provider
    - Is comfortable making hard calls, simplifying, refactoring, or rebuilding legacy pipelines when quality and scalability demand it
    - Uses AI tools to accelerate work, but rigorously reviews and validates AI-generated code, insisting on sound system design
    - Thrives in a high-bar, high-ownership environment with other exceptional engineers
    - Loves deep technical problems in data infrastructure, distributed systems, and performance

    Nice to have:
    - Experience working with financial data (market, risk, portfolio, transactional, or alternative datasets)
    - Familiarity with ML infrastructure, such as feature stores, experiment tracking, or model serving systems
    - Background in a high-growth startup or a foundational infrastructure role

    Compensation & setup:
    - Competitive salary and founder-level equity
    - Hybrid role based in San Francisco, with close collaboration and significant ownership
    - Small, elite team building core infrastructure with outsized impact
    $200k-300k yearly 3d ago
  • Data Scientist with Gen Ai and Python experience

    Droisys

    Data engineer job in Palo Alto, CA

    About the Company:
    Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction. Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.

    Role: Data Scientist with GenAI and Python experience
    Location: Palo Alto, CA (5 days onsite)
    Interview Mode: Phone & F2F

    Job Overview:
    We are seeking a competent Data Scientist who is independent, results-driven, and capable of taking business requirements and building out the technologies to generate statistically sound analysis and production-grade ML models.
    - DS skills with GenAI and LLM knowledge
    - Expertise in Python/Spark and their related libraries and frameworks
    - Experience building training ML pipelines and with the effort involved in ML model deployment
    - Experience with other ML concepts: real-time distributed model inferencing pipeline, Champion/Challenger framework, A/B Testing, Model.
    - Familiarity with DS/ML production implementation
    - Excellent problem-solving skills, with attention to detail and a focus on quality and timely delivery of assigned tasks
    - Prior knowledge of Azure cloud and Databricks is a big plus

    Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.
    $104k-146k yearly est. 2d ago
  • AI Data Engineer

    Hartleyco

    Data engineer job in Fremont, CA

    Member of Technical Staff - AI Data Engineer
    San Francisco (In-Office)
    $150K to $225K + Equity

    A high-growth, AI-native startup coming out of stealth is hiring AI Data Engineers to build the systems that power production-grade AI. The company has recently signed a Series A term sheet and is scaling rapidly. This role is central to unblocking current bottlenecks across data engineering, context modeling, and agent performance.

    Responsibilities:
    • Build distributed, reliable data pipelines using Airflow, Temporal, and n8n
    • Model SQL, vector, and NoSQL databases (Postgres, Qdrant, etc.)
    • Build API and function-based services in Python
    • Develop custom automations (Playwright, Stagehand, Zapier)
    • Work with AI researchers to define and expose context as services
    • Identify gaps in data quality and drive changes to upstream processes
    • Ship fast, iterate, and own outcomes end-to-end

    Required Experience:
    • Strong background in data engineering
    • Hands-on experience working with LLMs or LLM-powered applications
    • Data modeling skills across SQL and vector databases
    • Experience building distributed systems
    • Experience with Airflow, Temporal, n8n, or similar workflow engines
    • Python experience (API/services)
    • Startup mindset and bias toward rapid execution

    Nice To Have:
    • Experience with stream processing (Flink)
    • dbt or ClickHouse experience
    • CDC pipelines
    • Experience with context construction, RAG, or agent workflows
    • Analytical tooling (PostHog)

    What You Can Expect:
    • High-intensity, in-office environment
    • Fast decision-making and rapid shipping cycles
    • Real ownership over architecture and outcomes
    • Opportunity to work on AI systems operating at meaningful scale
    • Competitive compensation package
    • Meals provided plus full medical, dental, and vision benefits

    If this sounds like you, please apply now.
    $150k-225k yearly 5d ago
  • Optical Sensing, Hardware Data Analysis Engineer for a Global Consumer Device Company

    OSI Engineering

    Data engineer job in Cupertino, CA

    Our optical sensing team develops optical sensors for next-generation products. The team is seeking a self-driven go-getter with strong Python skills and strong experience in optical instruments, data analysis, and data visualization.

    Responsibilities:
    - Manage and report the engineering build process using Python and JMP to analyze large sets of data and track key figures of merit
    - Validate the ambient light sensors' color and lux sensing performance using Python and spectrometers
    - Assist with miscellaneous lab work to conduct failure analysis or research, such as display light leakage, cover glass properties, effects from the thermal environment, etc.
    - Support the creation of a performance simulation model using MATLAB
    - Lead end-to-end lab validation to support new optical sensor development
    - Develop and implement validation plans for hardware/software designs
    - Benchmark optical sensor performance from early prototype to product launch
    - Provide guidance and recommendations on production line testing requirements
    - Analyze data to draw conclusions and provide feedback to product design
    - Convert data to visual plots and/or charts
    - Collaborate with cross-functional teams including Optical, Mechanical, Electrical, and Process Engineering to deliver state-of-the-art sensing solutions
    - Deliver presentations of results in regular reviews with cross-functional teams

    Requirements:
    - Degree in Optics, Physics, Electrical Engineering, or equivalent; B.S./M.S. with industry experience, or Ph.D.
    - Strong background in optical measurements and data analysis
    - Experience using Python or other coding languages for lab equipment control, data acquisition, and instrument automation; able to write, rewrite, revise, customize, and automate scripts
    - Hands-on experience with optical lab equipment (light sources, spectrometers, detectors, oscilloscopes, free-space optics on an optical bench, etc.)
    - Excellent written and verbal communication skills
    - Solid teamwork and self-motivation for technical challenges

    Preferred Skillset:
    - Both hardware and software background

    Type: Contract (12+ months)
    Location: Cupertino, CA (100% onsite)
    $123k-175k yearly est. 1d ago
  • Senior Data Engineer

    Skale

    Data engineer job in Fremont, CA

    We're hiring a Senior/Lead Data Engineer to join a fast-growing AI startup. The team comes from a billion-dollar AI company and has raised a $40M+ seed round. You'll need to be comfortable transforming and moving data from legacy sources into a new group-level data warehouse.

    What we're looking for:
    - A strong data modeling background
    - Proven proficiency in modern data transformation tools, specifically dbt and/or SQLMesh
    - Exceptional ability to apply systems thinking and complex problem-solving to ambiguous challenges
    - Experience within a high-growth startup environment (highly valued)
    - Deep, practical knowledge of the entire data lifecycle, from generation and governance through to advanced downstream applications (e.g., fueling AI/ML models, LLM consumption, and core product features)
    - Outstanding ability to communicate technical complexity clearly, synthesizing information into actionable frameworks for executive and cross-functional teams
    $126k-177k yearly est. 5d ago
  • Senior Data Engineer

    Sigmaways Inc.

    Data engineer job in Fremont, CA

    If you're hands-on with modern data platforms, cloud tech, and big data tools, and you like building solutions that are secure, repeatable, and fast, this role is for you. As a Senior Data Engineer, you will design, build, and maintain scalable data pipelines that transform raw information into actionable insights. The ideal candidate will have strong experience across modern data platforms, cloud environments, and big data technologies, with a focus on building secure, repeatable, and high-performing solutions.

    Responsibilities:
    - Design, develop, and maintain secure, scalable data pipelines to ingest, transform, and deliver curated data into the Common Data Platform (CDP)
    - Participate in Agile rituals and contribute to delivery within the Scaled Agile Framework (SAFe)
    - Ensure quality and reliability of data products through automation, monitoring, and proactive issue resolution
    - Deploy alerting and auto-remediation for pipelines and data stores to maximize system availability
    - Apply a security-first and automation-driven approach to all data engineering practices
    - Collaborate with cross-functional teams (data scientists, analysts, product managers, and business stakeholders) to align infrastructure with evolving data needs
    - Stay current on industry trends and emerging tools, recommending improvements to strengthen efficiency and scalability

    Qualifications:
    - Bachelor's degree in Computer Science, Information Systems, or a related field (or equivalent experience)
    - At least 3 years of experience with Python and PySpark, including Jupyter notebooks and unit testing
    - At least 2 years of experience with Databricks, Collibra, and Starburst
    - Proven work with relational and NoSQL databases, including STAR and dimensional modeling approaches
    - Hands-on experience with modern data stacks: object stores (S3), Spark, Airflow, lakehouse architectures, and cloud warehouses (Snowflake, Redshift)
    - Strong background in ETL and big data engineering (on-prem and cloud)
    - Work within enterprise cloud platforms (CFS2, Cloud Foundational Services 2/EDS) for governance and compliance
    - Experience building end-to-end pipelines for structured, semi-structured, and unstructured data using Spark
    $110k-156k yearly est. 3d ago
  • Data Engineer

    Midjourney

    Data engineer job in Fremont, CA

    Midjourney is a research lab exploring new mediums to expand the imaginative powers of the human species. We are a small, self-funded team focused on design, human infrastructure, and AI. We have no investors, no big company controlling us, and no advertisers. We are 100% supported by our amazing community. Our tools are already used by millions of people to dream, to explore, and to create. But this is just the start. We think the story of the 2020s is about building the tools that will remake the world for the next century. We're making those tools, to expand what it means to be human.

    Core Responsibilities:
    - Design and maintain data pipelines to consolidate information across multiple sources (subscription platforms, payment systems, infrastructure and usage monitoring, and financial systems) into a unified analytics environment
    - Build and manage interactive dashboards and self-service BI tools that enable leadership to track key business metrics including revenue performance, infrastructure costs, customer retention, and operational efficiency
    - Serve as technical owner of our financial planning platform (Pigment or similar), leading implementation and build-out of models, data connections, and workflows in partnership with Finance leadership to translate business requirements into functional system architecture
    - Develop automated data quality checks and cleaning processes to ensure accuracy and consistency across financial and operational datasets
    - Partner with Finance, Product, and Operations teams to translate business questions into analytical frameworks, including cohort analysis, cost modeling, and performance trending
    - Create and maintain documentation for data models, ETL processes, dashboard logic, and system workflows to ensure knowledge continuity
    - Support strategic planning initiatives by building financial models, scenario analyses, and data-driven recommendations for resource allocation and growth investments

    Required Qualifications:
    - 3-5+ years of experience in data engineering, analytics engineering, or a similar role, with demonstrated ability to work with large-scale datasets
    - Strong SQL skills and experience with modern data warehousing solutions (BigQuery, Snowflake, Redshift, etc.)
    - Proficiency in at least one programming language (Python, R) for data manipulation and analysis
    - Experience with BI/visualization tools (Looker, Tableau, Power BI, or similar)
    - Hands-on experience administering enterprise financial systems (NetSuite, SAP, Oracle, or similar ERP platforms)
    - Experience working with Stripe Billing or similar subscription management platforms, including data extraction and revenue reporting
    - Ability to communicate technical concepts clearly to non-technical stakeholders
    $110k-156k yearly est. 1d ago
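The cohort analysis this posting mentions reduces to grouping users by signup period and counting who is still active N periods later. A minimal sketch in plain Python, with a made-up event log (months encoded as integers for simplicity):

```python
from collections import defaultdict

def cohort_retention(events):
    """events: iterable of (user_id, signup_month, active_month) tuples.

    Returns {signup_month: {months_since_signup: n_active_users}},
    i.e. the rows of a classic cohort retention triangle.
    """
    cohorts = defaultdict(lambda: defaultdict(set))
    for user, signup, active in events:
        cohorts[signup][active - signup].add(user)
    return {cohort: {offset: len(users) for offset, users in sorted(m.items())}
            for cohort, m in cohorts.items()}

# Toy log: u1 and u2 sign up in month 0; u3 in month 1
events = [
    ("u1", 0, 0), ("u1", 0, 1), ("u1", 0, 2),
    ("u2", 0, 0), ("u2", 0, 1),
    ("u3", 1, 1), ("u3", 1, 2),
]
table = cohort_retention(events)
print(table)  # {0: {0: 2, 1: 2, 2: 1}, 1: {0: 1, 1: 1}}
```

In a warehouse setting the same shape would come out of a SQL `GROUP BY` on signup month and month offset; the point is only that each cell counts distinct users active at a given offset from signup.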
  • AI Data Engineer

    Franklin Fitch

    Data engineer job in Palo Alto, CA

    Experience Level: Mid-Senior (4+ years)

    About the Role:
    This role supports the organization's data, analytics, and AI innovation efforts by building reliable data pipelines, developing machine learning solutions, and improving how information is collected, processed, and used across the business. This person is a collaborative problem-solver who can translate business needs into technical solutions, work confidently across large datasets, and deliver clear, actionable insights that help drive strategic outcomes.

    What They Do:
    - Support innovation projects by facilitating discussions, refining business processes, integrating data sources, and advising on best practices in analytics and AI
    - Build and maintain data architectures, APIs, and pipelines to support both operational systems and AI initiatives
    - Design and implement machine learning models, including developing prompts and AI-driven applications
    - Collaborate with technical and non-technical teams to improve data quality, reporting, and analytics across the organization
    - Manage large, complex datasets and develop tools to extract meaningful insights
    - Automate manual processes, optimize data delivery, and enhance internal infrastructure
    - Create dashboards and visualizations that clearly communicate insights to stakeholders
    - Apply statistical, computational, NLP, and ML techniques to solve business problems
    - Document processes, use version control best practices, and deploy models to production using enterprise platforms such as Azure
    - Ensure data security and privacy across systems, vendors, and applications

    Experience & Skills:
    - 4+ years in a Data Engineering or DataOps/DevOps role
    - Strong experience with SQL, relational databases, cloud environments (Azure/AWS), and building/optimizing data pipelines
    - Ability to work with structured and unstructured data and perform root-cause analysis on data and integrations
    - Experience with AI/ML data preparation, preprocessing, feature engineering, and model deployment
    - Familiarity with vector databases, embeddings, and RAG-based applications
    - Strong communication skills and ability to collaborate across teams
    - Ability to handle confidential and sensitive information responsibly

    Technologies:
    - Data & BI: SQL Server, T-SQL, SSIS/SSRS, ETL/ELT, Power BI, DAX, Power Query
    - Programming: Python, R, C++, Julia, JavaScript, SQL
    - File Formats: JSON, XML, SQL
    - Extras (Nice to Have): PowerShell, regex, VBA, documentation/process mapping, Tableau/Domo, Azure ML, legal tech platforms, LLM integration (OpenAI, Azure OpenAI, Claude), embedding models
    $110k-157k yearly est. 4d ago
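The vector-database and RAG familiarity this posting asks for boils down to one core operation: given a query embedding, find the most similar stored embeddings by cosine similarity. A minimal NumPy sketch (real systems delegate this to a vector store like Qdrant or pgvector, and the vectors here are made up):

```python
import numpy as np

def top_k_by_cosine(query, corpus, k=2):
    """Return (indices of the k most similar corpus vectors, all similarities)."""
    q = query / np.linalg.norm(query)                       # normalize query
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)  # normalize rows
    sims = c @ q                                            # cosine similarities
    return np.argsort(-sims)[:k], sims

# Toy 3-dimensional "embeddings" for four documents
docs = np.array([[1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
idx, sims = top_k_by_cosine(np.array([1.0, 0.05, 0.0]), docs)
print(idx)  # the two documents pointing in nearly the same direction
```

In a RAG pipeline, the texts behind the returned indices are what gets stuffed into the LLM prompt as retrieved context.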
  • Snowflake Data Architect

    Lumicity

    Data engineer job in Santa Clara, CA

    Senior Snowflake Data Engineer (Contract | Long-Term)

    We're partnering with an enterprise data platform team on a long-term initiative where Snowflake is the primary cloud data warehouse supporting analytics and reporting at scale. This role is ideal for someone whose core strength is Snowflake, with some experience working alongside Databricks in modern data ecosystems.

    What you'll be doing:
    - Building and maintaining ELT pipelines primarily in Snowflake
    - Writing, optimizing, and troubleshooting complex Snowflake SQL
    - Managing Snowflake objects: virtual warehouses, schemas, streams, tasks, and secure views
    - Supporting performance tuning, cost optimization, and warehouse sizing
    - Collaborating with analytics and business teams to deliver trusted datasets
    - Integrating upstream or adjacent processing from Databricks where applicable

    What we're looking for:
    - Strong, hands-on Snowflake data engineering experience (primary platform)
    - Advanced SQL expertise within Snowflake
    - Experience designing ELT pipelines and analytical data models
    - Working knowledge of Databricks/Spark in a production environment
    - Understanding of Snowflake governance, security, and cost controls

    Nice to have:
    - dbt experience
    - Experience supporting enterprise analytics or reporting teams
    - Exposure to cloud-based data platforms in large-scale environments

    Engagement details:
    - Contract (long-term)
    - Competitive hourly rate
    - Remote or hybrid (US-based)

    This role is best suited for engineers who go deep in Snowflake and can collaborate across platforms when Databricks is part of the stack.
    $118k-166k yearly est. 5d ago
  • AWS Data Architect

    Fractal

    Data engineer job in San Jose, CA

    Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a "Cool Vendor" and a "Vendor to Watch" by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal.

    Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will design the system architecture and solution, ensure the platform is scalable and performant, and create automated data pipelines.

    Responsibilities:

    Design & architecture of scalable data platforms:
    - Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
    - Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management)
    - Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility

    Client & business stakeholder engagement:
    - Partner with business stakeholders to translate functional requirements into scalable technical solutions
    - Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases

    Data pipeline development & collaboration:
    - Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, and SQL
    - Enable data ingestion from diverse sources such as ERP (SAP), POS data, syndicated data, CRM, e-commerce platforms, and third-party datasets

    Performance, scalability, and reliability:
    - Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques
    - Implement monitoring and alerting using Databricks observability, Ganglia, and cloud-native tools

    Security, compliance & governance:
    - Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies
    - Establish data governance practices including Data Fitness Index, quality scores, SLA monitoring, and metadata cataloging

    Adoption of AI copilots & agentic development:
    - Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for writing PySpark, SQL, and Python code snippets for data engineering and ML tasks; generating documentation and test cases to accelerate pipeline development; and interactive debugging and iterative code optimization within notebooks
    - Advocate for agentic AI workflows that use specialized agents for data profiling and schema inference, and for automated testing and validation

    Innovation and continuous learning:
    - Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling
    - Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements

    Requirements:
    - Bachelor's or master's degree in Computer Science, Information Technology, or a related field
    - 8-12 years of hands-on experience in data engineering, with at least 5 years on Python and Apache Spark
    - Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, and SQL
    - Excellent hands-on experience with workload automation tools such as Airflow, Prefect, etc.
    - Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
    - Experience designing Lakehouse architectures with bronze, silver, and gold layering
    - Strong understanding of data modeling concepts, star/snowflake schemas, dimensional modeling, and modern cloud-based data warehousing
    - Experience designing data marts using cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.)
    - Experience with CI/CD pipelines using tools such as AWS CodeCommit, Azure DevOps, and GitHub Actions
    - Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
    - In-depth experience with AWS cloud services such as Glue, S3, Redshift, etc.
    - Strong understanding of data privacy, access controls, and governance best practices
    - Experience working with RBAC, tokenization, and data classification frameworks
    - Excellent communication skills for stakeholder interaction, solution presentations, and team coordination
    - Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements
    - Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
    - Must be able to work in the PST time zone

    Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is $150k-$180k. In addition, you may be eligible for a discretionary bonus for the current performance period.

    Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a "free time" PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.

    Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.
    $150k-180k yearly 2d ago
  • Staff Data Engineer

    Strativ Group

    Data engineer job in San Jose, CA

🌎 San Francisco (Hybrid) 💼 Founding/Staff Data Engineer 💵 $200-300k base Our client is an elite applied AI research and product lab building AI-native systems for finance, and pushing frontier models into real production environments. Their work sits at the intersection of data, research, and high-stakes financial decision-making. As the Founding Data Engineer, you will own the data platform that powers everything: models, experiments, and user-facing products relied on by demanding financial customers. You'll make foundational architectural decisions, work directly with researchers and product engineers, and help define how data is built, trusted, and scaled from day one. What you'll do: Design and build the core data platform, ingesting, transforming, and serving large-scale financial and alternative datasets. Partner closely with researchers and ML engineers to ship production-grade data and feature pipelines that power cutting-edge models. Establish data quality, observability, lineage, and reproducibility across both experimentation and production workloads. Deploy and operate data services using Docker and Kubernetes in a modern cloud environment (AWS, GCP, or Azure). Make foundational choices on tooling, architecture, and best practices that will define how data works across the company. Continuously simplify and evolve systems, rewriting pipelines or infrastructure when it's the right long-term decision. Ideal candidate: Have owned or built high-performance data systems end-to-end, directly supporting production applications and ML models. Are strongest in backend and data infrastructure, with enough frontend literacy to integrate cleanly with web products when needed. Can design and evolve backend services and pipelines (Node.js or Python) to support new product features and research workflows. Are an expert in at least one statically typed language, with a strong bias toward type safety, correctness, and maintainable systems. 
Have deployed data workloads and services using Docker and Kubernetes on a major cloud provider. Are comfortable making hard calls: simplifying, refactoring, or rebuilding legacy pipelines when quality and scalability demand it. Use AI tools to accelerate your work, but rigorously review and validate AI-generated code, insisting on sound system design. Thrive in a high-bar, high-ownership environment with other exceptional engineers. Love deep technical problems in data infrastructure, distributed systems, and performance. Nice to have: Experience working with financial data (market, risk, portfolio, transactional, or alternative datasets). Familiarity with ML infrastructure, such as feature stores, experiment tracking, or model serving systems. Background in a high-growth startup or a foundational infrastructure role. Compensation & setup: Competitive salary and founder-level equity Hybrid role based in San Francisco, with close collaboration and significant ownership Small, elite team building core infrastructure with outsized impact
    $200k-300k yearly 3d ago
  • AI Data Engineer

    Hartleyco

    Data engineer job in San Jose, CA

Member of Technical Staff - AI Data Engineer San Francisco (In-Office) $150K to $225K + Equity A high-growth, AI-native startup coming out of stealth is hiring AI Data Engineers to build the systems that power production-grade AI. The company has recently signed a Series A term sheet and is scaling rapidly. This role is central to unblocking current bottlenecks across data engineering, context modeling, and agent performance. Responsibilities: • Build distributed, reliable data pipelines using Airflow, Temporal, and n8n • Model SQL, vector, and NoSQL databases (Postgres, Qdrant, etc.) • Build API and function-based services in Python • Develop custom automations (Playwright, Stagehand, Zapier) • Work with AI researchers to define and expose context as services • Identify gaps in data quality and drive changes to upstream processes • Ship fast, iterate, and own outcomes end-to-end Required Experience: • Strong background in data engineering • Hands-on experience working with LLMs or LLM-powered applications • Data modeling skills across SQL and vector databases • Experience building distributed systems • Experience with Airflow, Temporal, n8n, or similar workflow engines • Python experience (API/services) • Startup mindset and bias toward rapid execution Nice To Have: • Experience with stream processing (Flink) • dbt or ClickHouse experience • CDC pipelines • Experience with context construction, RAG, or agent workflows • Analytical tooling (Posthog) What You Can Expect: • High-intensity, in-office environment • Fast decision-making and rapid shipping cycles • Real ownership over architecture and outcomes • Opportunity to work on AI systems operating at meaningful scale • Competitive compensation package • Meals provided plus full medical, dental, and vision benefits If this sounds like you, please apply now.
    $150k-225k yearly 5d ago
  • Senior Data Engineer

    Skale

    Data engineer job in San Jose, CA

We're hiring a Senior/Lead Data Engineer to join a fast-growing AI startup. The team comes from a billion-dollar AI company and has raised a $40M+ seed round. You'll need to be comfortable transforming and moving data into a new 'group-level' data warehouse from legacy sources. You'll have a strong data modeling background. Proven proficiency in modern data transformation tools, specifically dbt and/or SQLMesh. Exceptional ability to apply systems thinking and complex problem-solving to ambiguous challenges. Experience within a high-growth startup environment is highly valued. Deep, practical knowledge of the entire data lifecycle, from generation and governance through to advanced downstream applications (e.g., fueling AI/ML models, LLM consumption, and core product features). Outstanding ability to communicate technical complexity clearly, synthesizing information into actionable frameworks for executive and cross-functional teams.
    $125k-177k yearly est. 5d ago
  • Senior ML Data Engineer

    Midjourney

    Data engineer job in San Jose, CA

We're the data team behind Midjourney's image generation models. We handle the dataset side: processing, filtering, scoring, captioning, and all the distributed compute that makes high-quality training data possible. What you'd be working on: Large-scale dataset processing and filtering pipelines Training classifiers for content moderation and quality assessment Models for data quality and aesthetic evaluation Data visualization tools for experimenting on dataset samples Testing/simulating distributed inference pipelines Monitoring dashboards for data quality and pipeline health Performance optimization and infrastructure scaling Occasionally jumping into inference optimization and other cross-team projects Our current stack: PySpark, Slurm, distributed batch processing across a hybrid cloud setup. We're pragmatic about tools: if there's something better, we'll switch. We're looking for someone strong in either: Data engineering/ML pipelines at scale, or Cloud/infrastructure with distributed systems experience Don't need exact tech matches - comfort with adjacent technologies and willingness to learn matters more. We work with our own hardware plus GCP and other providers, so adaptability across different environments is valuable. Location: SF office a few times per week (we may make exceptions on location for truly exceptional candidates) The role offers variety: our team members often get pulled into different projects across the company, from dataset work to inference optimization. If you're interested in the intersection of large-scale data processing and cutting-edge generative AI, we'd love to hear from you.
    $110k-156k yearly est. 4d ago
  • Senior Data Engineer

    Sigmaways Inc.

    Data engineer job in San Jose, CA

If you're hands-on with modern data platforms, cloud tech, and big data tools, and you like building solutions that are secure, repeatable, and fast, this role is for you. As a Senior Data Engineer, you will design, build, and maintain scalable data pipelines that transform raw information into actionable insights. The ideal candidate will have strong experience across modern data platforms, cloud environments, and big data technologies, with a focus on building secure, repeatable, and high-performing solutions. Responsibilities: Design, develop, and maintain secure, scalable data pipelines to ingest, transform, and deliver curated data into the Common Data Platform (CDP). Participate in Agile rituals and contribute to delivery within the Scaled Agile Framework (SAFe). Ensure quality and reliability of data products through automation, monitoring, and proactive issue resolution. Deploy alerting and auto-remediation for pipelines and data stores to maximize system availability. Apply a security-first and automation-driven approach to all data engineering practices. Collaborate with cross-functional teams (data scientists, analysts, product managers, and business stakeholders) to align infrastructure with evolving data needs. Stay current on industry trends and emerging tools, recommending improvements to strengthen efficiency and scalability. Qualifications: Bachelor's degree in Computer Science, Information Systems, or related field (or equivalent experience). At least 3 years of experience with Python and PySpark, including Jupyter notebooks and unit testing. At least 2 years of experience with Databricks, Collibra, and Starburst. Proven work with relational and NoSQL databases, including STAR and dimensional modeling approaches. Hands-on experience with modern data stacks: object stores (S3), Spark, Airflow, lakehouse architectures, and cloud warehouses (Snowflake, Redshift). Strong background in ETL and big data engineering (on-prem and cloud). 
Experience working within enterprise cloud platforms (CFS2, Cloud Foundational Services 2/EDS) for governance and compliance. Experience building end-to-end pipelines for structured, semi-structured, and unstructured data using Spark.
    $110k-156k yearly est. 3d ago

Learn more about data engineer jobs

How much does a data engineer earn in Sunnyvale, CA?

The average data engineer in Sunnyvale, CA earns between $94,000 and $183,000 annually. This compares to the national average data engineer range of $80,000 to $149,000.

Average data engineer salary in Sunnyvale, CA

$131,000

What are the biggest employers of Data Engineers in Sunnyvale, CA?

The biggest employers of Data Engineers in Sunnyvale, CA are:
  1. Redolent
  2. Meta
  3. Catalyst Labs
  4. ETEK International
  5. Intuit
  6. Microsoft
  7. Health GPT Inc.
  8. Zyphra
  9. Stanford Health Care
  10. Amazon