
Data engineer jobs in Walnut Creek, CA

9,966 jobs
  • Staff Data Scientist

    Quantix Search

    Data engineer job in San Francisco, CA

    Staff Data Scientist | San Francisco | $250K-$300K + Equity

    We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI. In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.

    What you'll do:
    - Drive deep-dive analyses on user behavior, product performance, and growth drivers
    - Design and interpret A/B tests to measure product impact at scale
    - Build scalable data models, pipelines, and dashboards for company-wide use
    - Partner with Product and Engineering to embed experimentation best practices
    - Evaluate ML models, ensuring business relevance, performance, and trade-off clarity

    What we're looking for:
    - 5+ years in data science or product analytics at scale (consumer or marketplace preferred)
    - Advanced SQL and Python skills, with strong foundations in statistics and experimental design
    - Proven record of designing, running, and analyzing large-scale experiments
    - Ability to analyze and reason about ML models (classification, recommendation, LLMs)
    - Strong communicator with a track record of influencing cross-functional teams

    If you're excited by the sound of this challenge, apply today and we'll be in touch.
    $250k-300k yearly 2d ago
  • Data Scientist

    Skale (3.7 company rating)

    Data engineer job in San Francisco, CA

    We're working with a Series A health tech start-up pioneering a revolutionary approach to healthcare AI, developing neurosymbolic systems that combine statistical learning with structured medical knowledge. Their technology is being adopted by leading health systems and insurers to enhance patient outcomes through advanced predictive analytics.

    We're seeking Machine Learning Engineers who excel at the intersection of data science, modeling, and software engineering. You'll design and implement models that extract insights from longitudinal healthcare data, balancing analytical rigor, interpretability, and scalability. This role offers a unique opportunity to tackle foundational modeling challenges in healthcare, where your contributions will directly influence clinical, actuarial, and policy decisions.

    Key Responsibilities
    - Develop predictive models to forecast disease progression, healthcare utilization, and costs using temporal clinical data (claims, EHR, laboratory results, pharmacy records)
    - Design interpretable and explainable ML solutions that earn the trust of clinicians, actuaries, and healthcare decision-makers
    - Research and prototype innovative approaches leveraging both classical and modern machine learning techniques
    - Build robust, scalable ML pipelines for training, validation, and deployment in distributed computing environments
    - Collaborate cross-functionally with data engineers, clinicians, and product teams to ensure models address real-world healthcare needs
    - Communicate findings and methodologies effectively through visualizations, documentation, and technical presentations

    Required Qualifications
    - Strong foundation in statistical modeling, machine learning, or data science, with preference for experience in temporal or longitudinal data analysis
    - Proficiency in Python and ML frameworks (PyTorch, JAX, NumPyro, PyMC, etc.)
    - Proven track record of transitioning models from research prototypes to production systems
    - Experience with probabilistic methods, survival analysis, or Bayesian inference (highly valued; see the short sketch after this listing)

    Bonus Qualifications
    - Experience working with clinical data and healthcare terminologies (ICD, CPT, SNOMED CT, LOINC)
    - Background in actuarial modeling, claims forecasting, or risk adjustment methodologies
    $123k-171k yearly est. 5d ago
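
    The listing above flags survival analysis and Bayesian inference as highly valued. As a rough, non-authoritative illustration of the first of these (not this company's actual code or stack), here is a bare-bones Kaplan-Meier estimator in plain NumPy; the durations and event flags are made-up toy values.

    ```python
    # Minimal Kaplan-Meier survival estimate: S(t) is the product over event
    # times t_i <= t of (1 - deaths_i / at_risk_i). Toy inputs, for illustration.
    import numpy as np

    def kaplan_meier(durations, events):
        """durations: follow-up times; events: 1 = event observed, 0 = censored.
        Returns (event_times, survival_probabilities)."""
        durations = np.asarray(durations, dtype=float)
        events = np.asarray(events, dtype=int)
        times = np.sort(np.unique(durations[events == 1]))
        surv, s = [], 1.0
        for t in times:
            at_risk = np.sum(durations >= t)              # still under observation at t
            died = np.sum((durations == t) & (events == 1))
            s *= 1.0 - died / at_risk
            surv.append(s)
        return times, np.array(surv)

    t, s = kaplan_meier([5, 6, 6, 2, 4, 4], [1, 0, 1, 1, 1, 0])
    print(dict(zip(t, np.round(s, 3))))   # survival curve at each event time
    ```
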
  • Data Scientist

    Randomtrees

    Data engineer job in San Francisco, CA

    Key Responsibilities
    - Design and productionize models for opportunity scanning, anomaly detection, and significant change detection across CRM, streaming, ecommerce, and social data.
    - Define and tune alerting logic (thresholds, SLOs, precision/recall) to minimize noise while surfacing high-value marketing actions.
    - Partner with marketing, product, and data engineering to operationalize insights into campaigns, playbooks, and automated workflows, with clear monitoring and experimentation.

    Required Qualifications
    - Strong proficiency in Python (pandas, NumPy, scikit-learn; plus experience with PySpark or similar for large-scale data) and SQL on modern warehouses (e.g., BigQuery, Snowflake, Redshift).
    - Hands-on experience with time-series modeling and anomaly / changepoint / significant-movement detection (e.g., STL decomposition, EWMA/CUSUM, Bayesian/Prophet-style models, isolation forests, robust statistics); a short EWMA sketch follows this listing.
    - Experience building and deploying production ML pipelines (batch and/or streaming), including feature engineering, model training, CI/CD, and monitoring for performance and data drift.
    - Solid background in statistics and experimentation: hypothesis testing, power analysis, A/B testing frameworks, uplift/propensity modeling, and basic causal inference techniques.
    - Familiarity with cloud platforms (GCP/AWS/Azure), orchestration tools (e.g., Airflow/Prefect), and dashboarding/visualization tools to expose alerts and model outputs to business users.
    $108k-155k yearly est. 1d ago
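
    The EWMA/CUSUM family named above is easy to sketch. The following is a minimal, illustrative EWMA control-chart detector, assuming nothing about the employer's real pipeline; the smoothing factor, threshold, and toy data are all invented for the example.

    ```python
    # EWMA control chart: flag points whose deviation from the running EWMA
    # level exceeds k times the running EWMA of absolute residuals.
    import numpy as np

    def ewma_anomalies(series, alpha=0.3, k=3.0):
        series = np.asarray(series, dtype=float)
        level = series[0]      # running EWMA of the signal
        spread = 0.0           # running EWMA of absolute residuals
        flags = []
        for i, x in enumerate(series):
            resid = x - level
            # Only flag once the spread estimate has had a few points to settle
            flags.append(i > 5 and abs(resid) > k * spread)
            level = alpha * x + (1 - alpha) * level
            spread = alpha * abs(resid) + (1 - alpha) * spread
        return np.array(flags)

    rng = np.random.default_rng(0)
    data = rng.normal(100, 2, 200)
    data[150] += 15            # inject a spike
    # The injected spike at index 150 should be among the flagged indices.
    print(np.flatnonzero(ewma_anomalies(data)))
    ```
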
  • Data Engineer

    Brooksource (4.1 company rating)

    Data engineer job in San Francisco, CA

    Elevate Data Engineer | Hybrid, CA

    Brooksource is searching for an Associate Data Engineer to join our HealthCare partner to support their data analytics groups. This position is through Brooksource's Elevate Program and will include additional technical training including, but not limited to: SQL, Python, DBT, Azure, etc.

    Responsibilities
    - Assist in the design, development, and implementation of ELT/ETL data pipelines using Azure-based technologies
    - Support data warehouse environments for large-scale enterprise systems
    - Help implement and maintain data models following best practices
    - Participate in data integration efforts to support reporting and analytics needs
    - Perform data validation, troubleshooting, and incident resolution for data pipelines
    - Support documentation of data flows, transformations, and architecture

    DevOps & Platform Support
    - Assist with DevOps activities related to data platforms, including deployments and environment support
    - Help build and maintain automation scripts and reusable frameworks for data operations
    - Support CI/CD pipelines for data engineering workflows
    - Assist with monitoring, alerting, and basic performance optimization
    - Collaborate with senior engineers to support infrastructure-as-code and cloud resource management

    Collaboration & Delivery
    - Work closely with data engineers, solution leads, data modelers, analysts, and business partners
    - Help translate business requirements into technical data solutions
    - Participate in code reviews, sprint planning, and team ceremonies
    - Follow established architecture, security, and data governance standards

    Required Qualifications
    - Bachelor's degree in Computer Science, Engineering, Information Systems, or related field (or equivalent experience)
    - Foundational knowledge of data engineering concepts, including ETL/ELT and data warehousing
    - Experience or coursework with SQL and relational databases
    - Familiarity with Microsoft Azure or another cloud platform
    - Basic scripting experience (Python, SQL, PowerShell, or Bash)
    - Understanding of version control (Git)

    Preferred / Nice-to-Have Skills
    - Exposure to Azure services such as Azure Data Factory, Synapse Analytics, Azure SQL, or Data Lake
    - Basic understanding of CI/CD pipelines and DevOps concepts
    - Familiarity with data modeling concepts (star schema, normalization)
    - Interest in automation, cloud infrastructure, and reliability engineering
    - Internship or project experience in data engineering or DevOps environments
    $124k-172k yearly est. 2d ago
  • Founding Data Scientist (GTM)

    Greylock Partners (4.5 company rating)

    Data engineer job in San Francisco, CA

    An early-stage investment of ours is looking to make their first IC hire in data science. This company builds tools that help teams understand how their AI systems perform and improve them over time (and they already have a lot of enterprise customers). We're looking for a Sr Data Scientist to lead analytics for sales, marketing, and customer success. The job is about finding insights in data, running analyses and experiments, and helping the business make better decisions.

    Responsibilities:
    - Analyze data to improve how the company finds, converts, and supports customers
    - Create models that predict lead quality, conversion, and customer value
    - Build clear dashboards and reports for leadership
    - Work with teams across the company to answer key questions
    - Take initiative, communicate clearly, and dig into data to solve problems
    - Try new methods and tools to keep improving the company's GTM approach

    Qualifications:
    - 5+ years of related industry experience working with data and supporting business teams
    - Solid experience analyzing GTM or revenue-related data
    - Strong skills in SQL and modern analytics tools (Snowflake, Hex, dbt, etc.)
    - Comfortable owning data workflows, from cleaning and modeling to presenting insights
    - Able to work independently, prioritize well, and move projects forward without much direction
    - Clear thinker and communicator who can turn data into actionable recommendations
    - Adaptable and willing to learn new methods in a fast-paced environment

    About Us: Greylock is an early-stage investor in hundreds of remarkable companies including Airbnb, LinkedIn, Dropbox, Workday, Cloudera, Facebook, Instagram, Roblox, Coinbase, and Palo Alto Networks, among others. More can be found about us here: *********************

    How We Work: We are full-time, salaried employees of Greylock and provide free candidate referrals/introductions to our active investments. We will contact anyone who looks like a potential match, requesting to schedule a call with you immediately. Due to the selective nature of this service and the volume of applicants we typically receive from our job postings, a follow-up email will not be sent until a match is identified with one of our investments.

    Please note: We are not recruiting for any roles within Greylock at this time. This job posting is for direct employment with a startup in our portfolio.
    $116k-155k yearly est. 3d ago
  • Senior Data Engineer - Spark, Airflow

    Sigmaways Inc.

    Data engineer job in San Francisco, CA

    We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives. In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high-performance data processing systems and ensure data quality, reliability, and lineage across Mastercard's data ecosystem. The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.

    Responsibilities:
    - Design and optimize Spark-based ETL pipelines for large-scale data processing.
    - Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing (a minimal DAG sketch follows this listing).
    - Implement partitioning and shuffling strategies to improve Spark performance.
    - Ensure data lineage, quality, and traceability across systems.
    - Develop Python scripts for data transformation, aggregation, and validation.
    - Execute and tune Spark jobs using spark-submit.
    - Perform DataFrame joins and aggregations for analytical insights.
    - Automate multi-step processes through shell scripting and variable management.
    - Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.

    Qualifications:
    - Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience).
    - At least 7 years of experience in data engineering or big data development.
    - Strong expertise in Apache Spark architecture, optimization, and job configuration.
    - Proven experience authoring, scheduling, checkpointing, and monitoring Airflow DAGs.
    - Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
    - Expertise in Python programming, including data structures and algorithmic problem-solving.
    - Hands-on with Spark DataFrames and PySpark transformations using joins, aggregations, and filters.
    - Proficient in shell scripting, including managing and passing variables between scripts.
    - Experienced with spark-submit for deployment and tuning.
    - Solid understanding of ETL design, workflow automation, and distributed data systems.
    - Excellent debugging and problem-solving skills in large-scale environments.
    - Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
    - Knowledge of data lineage and data quality frameworks like Apache Atlas.
    - Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
    $110k-157k yearly est. 5d ago
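
    The listing above centers on Airflow DAGs driving spark-submit jobs. Below is a minimal sketch of that pattern for Airflow 2.4+; the DAG id, schedule, paths, Spark configuration, and job script are hypothetical placeholders, not the employer's actual setup.

    ```python
    # Minimal Airflow DAG that submits a nightly Spark job via spark-submit.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="nightly_spark_etl",          # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                   # Airflow 2.4+ keyword
        catchup=False,
    ) as dag:
        run_etl = BashOperator(
            task_id="spark_submit_etl",
            bash_command=(
                "spark-submit "
                "--master yarn --deploy-mode cluster "
                "--conf spark.sql.shuffle.partitions=400 "   # tune for data volume
                "/opt/jobs/etl_job.py --run-date {{ ds }}"   # hypothetical script
            ),
        )
    ```
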
  • Data Engineer

    Midjourney

    Data engineer job in San Francisco, CA

    Midjourney is a research lab exploring new mediums to expand the imaginative powers of the human species. We are a small, self-funded team focused on design, human infrastructure, and AI. We have no investors, no big company controlling us, and no advertisers. We are 100% supported by our amazing community. Our tools are already used by millions of people to dream, to explore, and to create. But this is just the start. We think the story of the 2020s is about building the tools that will remake the world for the next century. We're making those tools, to expand what it means to be human.

    Core Responsibilities:
    - Design and maintain data pipelines to consolidate information across multiple sources (subscription platforms, payment systems, infrastructure and usage monitoring, and financial systems) into a unified analytics environment
    - Build and manage interactive dashboards and self-service BI tools that enable leadership to track key business metrics including revenue performance, infrastructure costs, customer retention, and operational efficiency
    - Serve as technical owner of our financial planning platform (Pigment or similar), leading implementation and build-out of models, data connections, and workflows in partnership with Finance leadership to translate business requirements into functional system architecture
    - Develop automated data quality checks and cleaning processes to ensure accuracy and consistency across financial and operational datasets
    - Partner with Finance, Product, and Operations teams to translate business questions into analytical frameworks, including cohort analysis, cost modeling, and performance trending
    - Create and maintain documentation for data models, ETL processes, dashboard logic, and system workflows to ensure knowledge continuity
    - Support strategic planning initiatives by building financial models, scenario analyses, and data-driven recommendations for resource allocation and growth investments

    Required Qualifications:
    - 3-5+ years experience in data engineering, analytics engineering, or a similar role with demonstrated ability to work with large-scale datasets
    - Strong SQL skills and experience with modern data warehousing solutions (BigQuery, Snowflake, Redshift, etc.)
    - Proficiency in at least one programming language (Python, R) for data manipulation and analysis
    - Experience with BI/visualization tools (Looker, Tableau, Power BI, or similar)
    - Hands-on experience administering enterprise financial systems (NetSuite, SAP, Oracle, or similar ERP platforms)
    - Experience working with Stripe Billing or similar subscription management platforms, including data extraction and revenue reporting
    - Ability to communicate technical concepts clearly to non-technical stakeholders
    $110k-157k yearly est. 4d ago
  • Data Engineer, Knowledge Graphs

    Mithrl

    Data engineer job in San Francisco, CA

    We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought. Mithrl is building the world's first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports. No coding. No waiting. No bioinformatics bottlenecks. We are one of the fastest-growing tech bio companies in the Bay Area with 12x year-over-year revenue growth. Our platform is used across three continents by leading biotechs and big pharmas. We power breakthroughs from early target discovery to mechanism-of-action. And we are just getting started.

    ABOUT THE ROLE
    We are hiring a Data Engineer, Knowledge Graphs to build the infrastructure that powers Mithrl's biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs to take curated knowledge sources and transform them into scalable, reliable, production-ready systems that serve the entire platform. Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph-structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently (a toy sketch of such a graph model follows this listing). You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases. This role is the bridge between biological knowledge ingestion and the high-performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have deep impact on the intelligence layer of Mithrl.

    WHAT YOU WILL DO
    - Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
    - Design, implement, and evolve schemas and storage models for graph-structured biological data
    - Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
    - Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
    - Build data models that support multi-tenant access, versioning, and reproducibility across releases
    - Implement scalable storage and indexing strategies for high-volume graph data
    - Maintain data quality, validate data integrity, and build monitoring around ingestion and usage
    - Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications
    - Support data warehousing, documentation, and API reliability
    - Ensure performance, reliability, and uptime for knowledge graph services

    WHAT YOU BRING
    Required Qualifications
    - Strong experience as a data engineer or backend engineer working with data-intensive systems
    - Experience building ETL or ELT pipelines for large structured or semi-structured datasets
    - Strong understanding of database design, schema modeling, and data architecture
    - Experience with graph data models or willingness to learn graph storage concepts
    - Proficiency in Python or similar languages for data engineering
    - Experience designing and maintaining APIs for data access
    - Understanding of versioning, provenance, validation, and reproducibility in data systems
    - Experience with cloud infrastructure and modern data stack tools
    - Strong communication skills and ability to work closely with scientific and engineering teams

    Nice to Have
    - Experience with graph databases or graph query languages
    - Experience with biological or chemical data sources
    - Familiarity with ontologies, controlled vocabularies, and metadata standards
    - Experience with data warehousing and analytical storage formats
    - Previous work in a tech bio company or scientific platform environment

    WHAT YOU WILL LOVE AT MITHRL
    - You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable
    - Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
    - Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
    - Speed: We ship fast (2x/week) and improve continuously based on real user feedback
    - Location: Beautiful SF office with a high-energy, in-person culture
    - Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans
    $110k-157k yearly est. 1d ago
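
    To make the graph-data-model requirement above concrete, here is a toy subject-predicate-object store with a tiny query surface. It is purely illustrative: the entity and relation names are invented, and a production system would sit on a real graph database with versioned schemas.

    ```python
    # Toy knowledge graph as subject-predicate-object triples with one query method.
    from collections import defaultdict

    class TripleStore:
        def __init__(self):
            # subject -> list of (predicate, object) edges
            self.out_edges = defaultdict(list)

        def add(self, subject, predicate, obj):
            self.out_edges[subject].append((predicate, obj))

        def neighbors(self, subject, predicate=None):
            """Objects linked from `subject`, optionally filtered by predicate."""
            return [o for p, o in self.out_edges[subject]
                    if predicate is None or p == predicate]

    kg = TripleStore()
    kg.add("TP53", "participates_in", "apoptosis")            # hypothetical curation
    kg.add("TP53", "associated_with", "Li-Fraumeni syndrome")
    kg.add("apoptosis", "part_of", "programmed cell death")

    print(kg.neighbors("TP53"))                      # all relationships for TP53
    print(kg.neighbors("TP53", "participates_in"))   # ['apoptosis']
    ```
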
  • Imaging Data Engineer/Architect

    Intuitive.Ai

    Data engineer job in San Francisco, CA

    About us: Intuitive is an innovation-led engineering company delivering business outcomes for hundreds of enterprises globally. With the reputation of being a Tiger Team and a Trusted Partner of enterprise technology leaders, we help solve the most complex Digital Transformation challenges across the following Intuitive Superpowers:
    - Modernization & Migration: Application & Database Modernization, Platform Engineering (IaC/EaC, DevSecOps & SRE), Cloud Native Engineering, Migration to Cloud, VMware Exit, FinOps
    - Data & AI/ML: Data (Cloud Native / DataBricks / Snowflake), Machine Learning, AI/GenAI
    - Cybersecurity: Infrastructure Security, Application Security, Data Security, AI/Model Security
    - SDx & Digital Workspace (M365, G-suite): SDDC, SD-WAN, SDN, NetSec, Wireless/Mobility, Email, Collaboration, Directory Services, Shared Files Services
    - Intuitive Services: Professional and Advisory Services, Elastic Engineering Services, Managed Services, Talent Acquisition & Platform Resell Services

    About the job:
    Title: Imaging Data Engineer/Architect
    Start Date: Immediate
    # of Positions: 1
    Position Type: Contract/Full-Time
    Location: San Francisco, CA
    Notes: Imaging data engineer/architect who understands radiology and digital pathology, related clinical data, and metadata. Hands-on experience with the above technologies, and good knowledge of biomedical imaging and data pipelines overall.

    About the Role: We are seeking a highly skilled Imaging Data Engineer/Architect to join our San Francisco team as a Subject Matter Expert (SME) in radiology and digital pathology. This role will design and manage imaging data pipelines, ensuring seamless integration of clinical data and metadata to support advanced diagnostic and research applications. The ideal candidate will have deep expertise in medical imaging standards, cloud-based data architectures, and healthcare interoperability, contributing to innovative solutions that enhance patient outcomes.

    Responsibilities
    - Design and implement scalable data architectures for radiology and digital pathology imaging data, including DICOM, HL7, and FHIR standards.
    - Develop and optimize data pipelines to process and store large-scale imaging datasets (e.g., MRI, CT, histopathology slides) and associated metadata.
    - Collaborate with clinical teams to understand radiology and pathology workflows, ensuring data solutions align with clinical needs.
    - Ensure data integrity, security, and compliance with healthcare regulations (e.g., HIPAA, GDPR).
    - Integrate imaging data with AI/ML models for diagnostic and predictive analytics, working closely with data scientists.
    - Build and maintain metadata schemas to support data discoverability and interoperability across systems.
    - Provide technical expertise to cross-functional teams, including product managers and software engineers, to drive imaging data strategy.
    - Conduct performance tuning and optimization of imaging data storage and retrieval systems in cloud environments (e.g., AWS, Google Cloud, Azure).
    - Document data architectures and processes, ensuring knowledge transfer to internal teams and external partners.
    - Stay updated on emerging imaging technologies and standards, proposing innovative solutions to enhance data workflows.

    Qualifications
    - Education: Bachelor's degree in Computer Science, Biomedical Engineering, or a related field (master's preferred).
    - Experience: 5+ years in data engineering or architecture, with at least 3 years focused on medical imaging (radiology and/or digital pathology). Proven experience with DICOM, HL7, FHIR, and imaging metadata standards (e.g., SNOMED, LOINC). Hands-on experience with cloud platforms (AWS, Google Cloud, or Azure) for imaging data storage and processing.
    - Technical Skills: Proficiency in programming languages (e.g., Python, Java, SQL) for data pipeline development. Expertise in ETL processes, data warehousing, and database management (e.g., Snowflake, BigQuery, PostgreSQL). Familiarity with AI/ML integration for imaging data analytics. Knowledge of containerization (e.g., Docker, Kubernetes) for deploying data solutions.
    - Domain Knowledge: Deep understanding of radiology and digital pathology workflows, including PACS and LIS systems. Familiarity with clinical data integration and healthcare interoperability standards.
    - Soft Skills: Strong analytical and problem-solving skills to address complex data challenges. Excellent communication skills to collaborate with clinical and technical stakeholders. Ability to work independently in a fast-paced environment, with a proactive approach to innovation.
    - Certifications (preferred): AWS Certified Solutions Architect, Google Cloud Professional Data Engineer, or equivalent. Certifications in medical imaging (e.g., CIIP - Certified Imaging Informatics Professional).
    $110k-157k yearly est. 4d ago
  • Data Engineer / Analytics Specialist

    Ittconnect

    Data engineer job in San Francisco, CA

    Citizenship Requirement: U.S. Citizens Only

    ITTConnect is seeking a Data Engineer / Analytics Specialist to work for one of our clients, a major Technology Consulting firm with headquarters in Europe. They are experts in tailored technology consulting and services for banks, investment firms, and other Financial vertical clients.

    Job location: San Francisco Bay Area or New York City.
    Work Model: Ability to come into the office as requested.
    Seniority: 10+ years of total experience.

    About the role: The Data Engineer / Analytics Specialist will support analytics, product insights, and AI initiatives. You will build robust data pipelines, integrate data sources, and enhance the organization's analytical foundations.

    Responsibilities:
    - Build and operate Snowflake-based analytics environments.
    - Develop ETL/ELT pipelines (DBT, Airflow, etc.).
    - Integrate APIs, external data sources, and streaming inputs.
    - Perform query optimization, basic data modeling, and analytics support.
    - Enable downstream GenAI and analytics use cases.

    Requirements:
    - 10+ years of overall technology experience
    - 3+ years of hands-on AWS experience required
    - Strong SQL and Snowflake experience
    - Hands-on pipeline engineering with DBT, Airflow, or similar
    - Experience with API integrations and modern data architectures
    $110k-157k yearly est. 2d ago
  • Data Engineer

    Odiin

    Data engineer job in San Francisco, CA

    You'll work closely with engineering, analytics, and product teams to ensure data is accurate, accessible, and efficiently processed across the organization.

    Key Responsibilities:
    - Design, develop, and maintain scalable data pipelines and architectures.
    - Collect, process, and transform data from multiple sources into structured, usable formats.
    - Ensure data quality, reliability, and security across all systems.
    - Work with data analysts and data scientists to optimize data models for analytics and machine learning.
    - Implement ETL (Extract, Transform, Load) processes and automate workflows.
    - Monitor and troubleshoot data infrastructure, ensuring minimal downtime and high performance.
    - Collaborate with cross-functional teams to define data requirements and integrate new data sources.
    - Maintain comprehensive documentation for data systems and processes.

    Requirements:
    - Proven experience as a Data Engineer, ETL Developer, or similar role.
    - Strong programming skills in Python, SQL, or Scala.
    - Experience with data pipeline tools (Airflow, dbt, Luigi, etc.).
    - Familiarity with big data technologies (Spark, Hadoop, Kafka, etc.).
    - Hands-on experience with cloud data platforms (AWS, GCP, Azure, Snowflake, or Databricks).
    - Understanding of data modeling, warehousing, and schema design.
    - Solid knowledge of database systems (PostgreSQL, MySQL, NoSQL).
    - Strong analytical and problem-solving skills.
    $110k-157k yearly est. 2d ago
  • Data Engineer (SQL / SQL Server Focus)

    Franklin Fitch

    Data engineer job in San Francisco, CA

    Data Engineer (SQL / SQL Server Focus)
    (Kind note: we cannot provide sponsorship for this role.)

    A leading professional services organization is seeking an experienced Data Engineer to join its team. This role supports enterprise-wide systems, analytics, and reporting initiatives, with a strong emphasis on SQL Server-based data platforms.

    Key Responsibilities
    - Design, develop, and optimize SQL Server-centric ETL/ELT pipelines to ensure reliable, accurate, and timely data movement across enterprise systems.
    - Develop and maintain SQL Server data models, schemas, and tables to support financial analytics and reporting.
    - Write, optimize, and maintain complex T-SQL queries, stored procedures, functions, and views with a strong focus on performance and scalability.
    - Build and support SQL Server Reporting Services (SSRS) solutions, translating business requirements into clear, actionable reports.
    - Partner with finance and business stakeholders to define KPIs and ensure consistent, trusted reporting outputs.
    - Monitor, troubleshoot, and tune SQL Server workloads, including query performance, indexing strategies, and execution plans.
    - Ensure adherence to data governance, security, and access control standards within SQL Server environments.
    - Support documentation, version control, and change management for database and reporting solutions.
    - Collaborate closely with business analysts, data engineers, and IT teams to deliver end-to-end data solutions.
    - Mentor junior team members and contribute to database development standards and best practices.
    - Act as a key contributor to enterprise data architecture and reporting strategy, particularly around SQL Server platforms.

    Required Education & Experience
    - Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field.
    - 8+ years of hands-on experience working with SQL Server in enterprise data warehouse or financial reporting environments.
    - Advanced expertise in T-SQL, including query optimization, index design and maintenance, and stored procedures and performance tuning.
    - Strong experience with SQL Server Integration Services (SSIS) and SSRS.
    - Solid understanding of data warehousing concepts, including star and snowflake schemas, and OLAP vs. OLTP design.
    - Experience supporting large, business-critical databases with high reliability and performance requirements.
    - Familiarity with Azure-based SQL Server deployments (Azure SQL, Managed Instance, or SQL Server on Azure VMs) is a plus.
    - Strong analytical, problem-solving, and communication skills, with the ability to work directly with non-technical stakeholders.
    $110k-157k yearly est. 3d ago
  • Data Architect

    Highbrow LLC (3.8 company rating)

    Data engineer job in Emeryville, CA

    Role: Data Architect (SAP Data experience)

    Reporting to the VP, Development & Data, you will build and lead a team responsible for leveraging data-driven insights and advanced analytics to optimize decision-making, improve operational efficiency, and drive strategic business value across the organization. You will guide the design and implementation of Grocery Outlet's data science, AI, and data governance programs, ensuring robust data management, governance practices, and the development of innovative AI-enabled solutions that support key business areas such as merchandising, supply chain, marketing, and operations. You will collaborate closely with senior stakeholders across Business Intelligence, Enterprise Data Architecture, and Business Solutions to align data and analytics investments with enterprise goals. You are a visionary, strategic leader who sees data and AI as powerful enablers of business success. You combine deep technical expertise in data science, AI, and analytics with strong business acumen and excellent stakeholder communication. You are passionate about turning data into actionable insights that move the needle.

    Requirements

    Experience & Expertise
    • 12+ years of experience in data science, analytics, or data governance roles
    • 5+ years leading data science, AI, or enterprise analytics functions at a senior level
    • Proven track record of successfully delivering AI and analytics solutions that drive measurable business impact
    • Deep knowledge of machine learning techniques, predictive modeling, statistical analysis, and data visualization tools
    • Strong understanding of data governance frameworks, data privacy, security, and regulatory compliance (e.g., CCPA, GDPR)
    • Experience building and scaling analytics and data teams within retail, consumer products, or supply chain organizations
    • Bachelor's degree in Computer Science, Data Science, Engineering, or related field required; Master's or advanced degree strongly preferred
    • Experience with SAP data models and familiarity with various functions within SAP will be a big plus

    Leadership & Skills
    • Exceptional executive communication and stakeholder engagement capabilities
    • Collaborative and inclusive leader who builds strong relationships across business and technology functions
    • Skilled at translating complex data concepts into clear business insights and strategic recommendations
    • Metrics-driven and outcome-focused; disciplined in demonstrating the ROI of analytics investments
    • Strong team-builder and mentor with a proven ability to attract, grow, and retain analytics talent
    • Innovative thinker who proactively explores new techniques, tools, and industry trends

    Job Responsibilities

    Strategic Leadership & Vision
    • Define and execute a comprehensive data science, data governance, and AI strategy aligned with corporate priorities
    • Act as an influential advisor to executive leaders and business stakeholders on leveraging AI and data-driven insights
    • Champion the responsible and ethical use of AI and data, ensuring that all initiatives balance business value with transparency, security, and regulatory compliance
    • Lead the development and execution of Grocery Outlet's enterprise data governance framework

    AI & Data Science Development
    • Lead the vision and roadmap for advancing enterprise data capabilities and scaling AI and advanced analytics across core business areas
    • Lead enterprise data development initiatives to improve data quality and to cleanse and standardize master data
    • Drive close collaboration across the data team, business data organization, IT, and functional stakeholders to accelerate prototyping, testing, and deployment of analytics solutions
    • Stay current on emerging AI techniques, platforms, and trends; introduce innovation and best practices

    Data Governance & Management
    • Establish and enforce standards, policies, and procedures for data quality, accuracy, security, privacy, and compliance
    • Oversee the design and implementation of data lineage and master data management (MDM) practices
    • Collaborate closely with the Enterprise Data Architect to ensure data availability, integrity, and accessibility across the enterprise
    • Define and publish KPIs and metrics for data governance effectiveness and maturity

    People & Team Leadership
    • Recruit, build, and lead high-performing teams of data & AI engineers and data governance specialists
    • Foster a culture of collaboration, accountability, and continuous learning within your teams
    • Provide coaching and professional development to drive growth and career progression
    • Manage relationships and performance with external vendors, consultants, and analytics partners

    Performance & Value Realization
    • Measure, track, and communicate the business impact and value of AI and data initiatives
    • Develop clear success criteria, metrics, and dashboards for analytics-driven outcomes
    • Partner with Finance and the PMO to quantify and articulate the ROI of investments in AI, data science, and governance
    • Ensure transparency, timely delivery, and alignment of analytics projects with organizational goals
    $124k-174k yearly est. 2d ago
  • Senior Data Warehouse & BI Developer

    Ariat International (4.7 company rating)

    Data engineer job in San Leandro, CA

    About the Role
    We're looking for a Senior Data Warehouse & BI Developer to join our Data & Analytics team and help shape the future of Ariat's enterprise data ecosystem. You'll design and build data solutions that power decision-making across the company, from eCommerce to finance and operations. In this role, you'll take ownership of data modeling and BI reporting using Cognos and Tableau, and contribute to the development of SAP HANA Calculation Views. If you're passionate about data architecture, visualization, and collaboration, and love learning new tools, this role is for you.

    You'll Make a Difference By
    - Designing and maintaining Ariat's enterprise data warehouse and reporting architecture.
    - Developing and optimizing Cognos reports for business users.
    - Collaborating with the SAP HANA team to develop and enhance Calculation Views.
    - Translating business needs into technical data models and actionable insights.
    - Ensuring data quality through validation, testing, and governance practices.
    - Partnering with teams across the business to improve data literacy and reporting capabilities.
    - Staying current with modern BI and data technologies to continuously evolve Ariat's analytics stack.

    About You
    - 7+ years of hands-on experience in BI and Data Warehouse development.
    - Advanced skills in Cognos (Framework Manager, Report Studio).
    - Strong SQL skills and experience with data modeling (star schemas, dimensional modeling).
    - Experience building and maintaining ETL processes.
    - Excellent analytical and communication skills.
    - A collaborative, learning-oriented mindset.
    - Experience developing SAP HANA Calculation Views preferred.
    - Experience with Tableau (Desktop, Server) preferred.
    - Knowledge of cloud data warehouses (Snowflake, BigQuery, etc.).
    - Background in retail or eCommerce analytics.
    - Familiarity with Agile/Scrum methodologies.

    About Ariat
    Ariat is an innovative, outdoor global brand with roots in equestrian performance. We develop high-quality footwear and apparel for people who ride, work, and play outdoors, and care about performance, quality, comfort, and style.

    The salary range for this position is $120,000 - $150,000 per year. The salary is determined by the education, experience, knowledge, skills, and abilities of the applicant, internal equity, and alignment with market data for geographic locations. Ariat in good faith believes that this posted compensation range is accurate for this role at this location at the time of this posting. This range may be modified in the future.

    Ariat's holistic benefits package for full-time team members includes (but is not limited to):
    - Medical, dental, vision, and life insurance options
    - Expanded wellness and mental health benefits
    - Paid time off (PTO), paid holidays, and paid volunteer days
    - 401(k) with company match
    - Bonus incentive plans
    - Team member discount on Ariat merchandise

    Note: Availability of benefits may be subject to location & employment type and may have certain eligibility requirements. Ariat reserves the right to alter these benefits in whole or in part at any time without advance notice.

    Ariat will consider qualified applicants, including those with criminal histories, in a manner consistent with state and local laws. Ariat is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis protected under federal, state, or local law. Ariat is committed to providing reasonable accommodations to candidates with disabilities. If you need an accommodation during the application process, email *************************.

    Please see our Employment Candidate Privacy Policy at ********************* to learn more about how we collect, use, retain and disclose Personal Information.

    Please note that Ariat does not accept unsolicited resumes from recruiters or employment agencies. In the absence of a signed Agreement, Ariat will not consider or agree to payment of any referral compensation or recruiter/agency placement fee. In the event a recruiter or agency submits a resume or candidate without a previously signed Agreement, Ariat explicitly reserves the right to pursue and hire those candidate(s) without any financial obligation to the recruiter or agency. Any unsolicited resumes, including those submitted directly to hiring managers, are deemed to be the property of Ariat.
    $120k-150k yearly 3d ago
  • AWS Data Architect

    Fractal (4.2 company rating)

    Data engineer job in San Francisco, CA

    Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a 'Cool Vendor' and a 'Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal.

    Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.

    Responsibilities:

    Design & Architecture of Scalable Data Platforms
    - Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs.
    - Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
    - Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.

    Client & Business Stakeholder Engagement
    - Partner with business stakeholders to translate functional requirements into scalable technical solutions.
    - Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases.

    Data Pipeline Development & Collaboration
    - Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, and SQL.
    - Enable data ingestion from diverse sources such as ERP (SAP), POS data, syndicated data, CRM, e-commerce platforms, and third-party datasets.

    Performance, Scalability, and Reliability
    - Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
    - Implement monitoring and alerting using Databricks observability, Ganglia, and cloud-native tools.

    Security, Compliance & Governance
    - Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
    - Establish data governance practices including Data Fitness Index, quality scores, SLA monitoring, and metadata cataloging.

    Adoption of AI Copilots & Agentic Development
    - Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for writing PySpark, SQL, and Python code snippets for data engineering and ML tasks; generating documentation and test cases to accelerate pipeline development; and interactive debugging and iterative code optimization within notebooks.
    - Advocate for agentic AI workflows that use specialized agents for data profiling and schema inference, and for automated testing and validation.

    Innovation and Continuous Learning
    - Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
    - Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.

    Requirements:
    - Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
    - 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
    - Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, and SQL.
    - Excellent hands-on experience with workload automation tools such as Airflow, Prefect, etc.
    - Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage.
    - Experience designing Lakehouse architectures with bronze, silver, and gold layering.
    - Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
    - Experience designing data marts using cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
    - Experience with CI/CD pipelines using tools such as AWS CodeCommit, Azure DevOps, and GitHub Actions.
    - Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources.
    - In-depth experience with AWS Cloud services such as Glue, S3, Redshift, etc.
    - Strong understanding of data privacy, access controls, and governance best practices.
    - Experience working with RBAC, tokenization, and data classification frameworks.
    - Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
    - Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
    - Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality.
    - Must be able to work in the PST time zone.

    Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.

    Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a "free time" PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.

    Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
    $150k-180k yearly 5d ago
  • Senior Data Manager/ Data architect with DICOM

    Nexify Infosystems

    Data engineer job in San Francisco, CA

    The Image Curation and Data Products team transforms biomedical imaging data from clinical trials and RWD by applying the tools and workflows to deliver high-quality, FAIR imaging data sets, enabling imaging data scientists to discover and utilize them, from exploratory use to algorithm development.

    Job Description

    Key Responsibilities:
    ● Imaging Data Pipeline Delivery: Design, implement, and maintain automated pipelines for onboarding, verifying, transforming, and curating biomedical imaging data from clinical trials and real-world data sources, for the therapeutic areas Oncology, Neurology, and Ophthalmology, covering all image file formats.
    ● Data Quality and Integrity: Develop and implement solutions to detect and correct anomalies and inconsistencies, to achieve the highest data quality of the imaging data set, per industry standards (DICOM) and internal specifications like FFS, RTS, GDSR, etc. Ensure de-identification, PHI/PII controls, and image-specific QC checks are implemented at scale.
    ● Data Analysis and Integration: Integrate ML and AI-assisted tools in the pipelines for inline image analysis, classifications, and segmentations, to extract and enrich metadata for various analyses, optimize performance, etc.
    ● Image Data Management: Build and maintain large-scale catalogs of curated imaging data sets, enhancing FAIR principles and providing easy-to-discover and easy-to-access imaging data assets.
    ● Compliance and Controls: Ensure applicable compliance and privacy controls are followed as required (GXP, CSV validations).
    ● Collaboration: Work closely with image scientists, data scientists, clinops, and biomarker research teams, supporting data needs for various primary and secondary endpoint analyses.
    ● External Collaboration: Work with external partners, e.g. CROs, in ensuring the imaging data received conforms to the established agreements and quality standards and is complete.
    ● Lead the delivery team to ensure timely delivery of product backlog/features.
    ● Participate with the team and lead various agile ceremonies throughout planning and execution.

    The ideal candidate would have multiple competencies from the list below (a brief metadata-extraction sketch follows this listing):
    1. Worked with medical imaging data and platforms: PACS, VNAs, etc.
    2. Worked with radiology imaging data, e.g. CT, PET, MRI, NIfTI, and ophthalmic imaging: OCT, FA, CFP, etc.
    3. Good understanding of the DICOM standard, structure, metadata parsing, tags, and multi-frame images
    4. Worked with clinical information data standards like SDTM and ADaM
    5. Data integration across diverse data sources, e.g. imaging data with tabular clinical data
    6. De-identification methodologies, PHI/PII detection, and privacy controls
    7. Good understanding of GXP and CSV validation frameworks
    8. Proficient in Python and its libraries: pandas, pydicom, SimpleITK, dicom-numpy, dcm2niix
    9. Hands-on experience with ETL/ELT involving large medical imaging datasets
    10. Apache Airflow, Spark, Talend, or similar orchestration of complex workflows
    11. Proficiency with SQL and NoSQL, and image metadata stores (PostgreSQL, Mongo, etc.)
    12. Practical experience with AWS infrastructure and data services, e.g. RDS, Athena, Glue, EC2, Lambda, S3
    13. Familiar with EKS, Docker, and HPC
    14. Experience in data analysis and report generation, e.g. Tibco, Tableau, AWS QuickSight, etc.
    15. Good knowledge of Git, GitLab, and DevOps tools like Jenkins and Terraform
    16. Familiar with using ML workflows for Computer Vision tasks like segmentation, classification, etc.
    17. Nice to have: implemented solutions on NLP and GenAI
    18. Worked with cross-functional global teams in a dynamic Agile environment
    19. Led and mentored agile team members
    20. Has 10+ years of experience with data platforms, analysis, and insights
    $125k-174k yearly est. 3d ago
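
    Since the listing above leans on DICOM metadata parsing with pydicom, here is a hedged sketch of pulling a few header fields into a flat record for a catalog. It assumes the pydicom package is available; the file path is a placeholder and the chosen tags are illustrative, not a de-identification policy.

    ```python
    # Read only DICOM headers (not pixel data) and flatten a few tags.
    import pydicom

    def extract_metadata(path):
        ds = pydicom.dcmread(path, stop_before_pixels=True)  # headers only
        return {
            "sop_class": str(ds.get("SOPClassUID", "")),
            "modality": ds.get("Modality", ""),
            "study_date": ds.get("StudyDate", ""),
            "series_desc": ds.get("SeriesDescription", ""),
            # PHI-bearing tags like this one would be handled by a separate
            # de-identification step before any catalog is published.
            "patient_id": ds.get("PatientID", ""),
        }

    print(extract_metadata("/data/example.dcm"))  # placeholder path
    ```
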
  • Java Software Engineer

    Mindlance (4.6 company rating)

    Data engineer job in Concord, CA

    Role: Senior Software Engineer (Java)
    Contract: 12 to 24 months
    Skills Needed: Backend Java, API development, Microservices, Oracle, Splunk

    Client JD: We are seeking a Senior Software Engineer (SE3) with strong backend Java experience to support the development of APIs and microservices within a large-scale banking/transaction environment. The role involves modernizing monolithic applications, contributing to cloud migration (OCP), and ensuring platform stability, performance, and security.

    Key Responsibilities
    - Design, develop, test, and support backend APIs and microservices.
    - Work on modernization and cloud migration efforts.
    - Ensure scalability, resiliency, and secure SDLC practices.
    - Handle production support, monitoring, and issue resolution.
    - Collaborate with product managers, architects, and engineering teams.
    - Guide junior developers when needed.

    Required Skills
    - 4+ years Java/Spring development
    - 4+ years API/microservices experience
    - 2+ years Oracle database experience
    - Experience with Splunk or similar monitoring tools
    - Agile/Scrum experience

    Nice to Have
    - Experience decomposing monolithic apps
    - Cloud/OCP migration experience
    - Kafka or event-driven architecture
    - API management tools (e.g., Apigee)
    - Exposure to GenAI/Copilot (bonus)

    EEO: "Mindlance is an Equal Opportunity Employer and does not discriminate in employment based on - Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans."
    $113k-153k yearly est. 1d ago
  • Founding Software Engineer / Protocol Engineer

    The Crypto Recruiters (3.3 company rating)

    Data engineer job in San Francisco, CA

    We are actively searching for a Founding Protocol Engineer to join our team on a permanent basis. If you are someone who is impressed with what Hyperliquid has accomplished, then this role is for you. We are on a mission to build next-generation lending and debt protocols. We are open to both Senior-level and Architect-level candidates for this role.

    Your Rhythm:
    - Drive the architecture, technical design, and implementation of our lending protocol.
    - Collaborate closely with researchers to validate and test designs.
    - Collaborate with auditors and security engineers to ensure safety of the protocol.
    - Participate in code reviews, providing constructive feedback and ensuring adherence to established coding standards and best practices.

    Your Vibe:
    - 5+ years of professional software engineering experience
    - 3+ years of experience working in Solidity on the EVM in production environments, specifically focused on DeFi products
    - 2+ years of experience working with modern backend languages (Go, Rust, Python, etc.) in distributed architectures
    - Experience building lending protocols in a smart contract language
    - Open to collaborating onsite a few days a week at our downtown SF office

    Our Vibe:
    - Relaxed work environment
    - 100% paid top-of-the-line health care benefits
    - Full ownership, no micromanagement
    - Strong equity package
    - 401K
    - Unlimited vacation
    - An actual work/life balance; we aren't trying to run you into the ground. We have families and enjoy life too!
    $123k-170k yearly est. 2d ago
  • Software Engineer

    Premier Group (4.5 company rating)

    Data engineer job in San Francisco, CA

    Founding Engineer | $140K - $200K + equity | San Francisco (Onsite Role) | Direct Hire

    A fast-growing early-stage start-up that recently secured a significant amount at Seed is actively hiring 3x software engineers to join their founding team. They're looking for people who are scrappy, move fast, challenge assumptions, and are driven to win. They build quickly and expect teammates to push boundaries.

    Who You Are
    - Make quick, reversible ("two-way door") decisions
    - Proactively fix problems before being asked
    - Comfortable working across a modern engineering stack (e.g., TypeScript, Python, containerisation, ML/LLM tooling, databases, cloud environments, mobile frameworks)
    - Have built real, shipped products
    - Thrive in ambiguity and fast-moving environments

    What You'll Do
    - Talk directly with users to understand their workflows, pain points, and needs
    - Architect systems that support large enterprise usage
    - Build automated pipelines and intelligent agents that process and verify large volumes of data
    - Maintain scalable, robust infrastructure
    - Ship quickly - progress over perfection

    The Reality
    - You'll work closely with the founding team and directly with customers
    - User value beats hype, trends, and "cool tech"
    - Expect a demanding, high-output culture

    If you're a Software Engineer with 2+ years' experience and want to work in a growing start-up, please apply now for immediate consideration.
    $140k-200k yearly 3d ago
  • Senior ML Data Engineer

    Midjourney

    Data engineer job in San Francisco, CA

    We're the data team behind Midjourney's image generation models. We handle the dataset side: processing, filtering, scoring, captioning, and all the distributed compute that makes high-quality training data possible.

    What you'd be working on:
    - Large-scale dataset processing and filtering pipelines (a minimal PySpark sketch follows this listing)
    - Training classifiers for content moderation and quality assessment
    - Models for data quality and aesthetic evaluation
    - Data visualization tools for experimenting on dataset samples
    - Testing/simulating distributed inference pipelines
    - Monitoring dashboards for data quality and pipeline health
    - Performance optimization and infrastructure scaling
    - Occasionally jumping into inference optimization and other cross-team projects

    Our current stack: PySpark, Slurm, distributed batch processing across a hybrid cloud setup. We're pragmatic about tools; if there's something better, we'll switch.

    We're looking for someone strong in either:
    - Data engineering/ML pipelines at scale, or
    - Cloud/infrastructure with distributed systems experience

    You don't need exact tech matches; comfort with adjacent technologies and willingness to learn matters more. We work with our own hardware plus GCP and other providers, so adaptability across different environments is valuable.

    Location: SF office a few times per week (we may make exceptions on location for truly exceptional candidates).

    The role offers variety; our team members often get pulled into different projects across the company, from dataset work to inference optimization. If you're interested in the intersection of large-scale data processing and cutting-edge generative AI, we'd love to hear from you.
    $110k-157k yearly est. 2d ago
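
    As a rough illustration of the dataset filtering pipelines this team describes, here is a minimal PySpark pass that derives a simple score and filters records. The column names, thresholds, and paths are invented for the example and are not Midjourney's actual pipeline.

    ```python
    # Minimal PySpark filter/score pass over an image-caption dataset.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("dataset_filter").getOrCreate()

    df = spark.read.parquet("s3://bucket/captions/")   # placeholder input path
    scored = (
        df.withColumn("caption_len", F.length("caption"))      # assumed column
          .filter(
              (F.col("caption_len") > 10)                      # drop trivial captions
              & (F.col("nsfw_score") < 0.2)                    # assumed moderation score
          )
    )
    scored.write.mode("overwrite").parquet("s3://bucket/filtered/")  # placeholder output
    ```
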

Learn more about data engineer jobs

How much does a data engineer earn in Walnut Creek, CA?

The average data engineer in Walnut Creek, CA earns between $93,000 and $183,000 annually. This compares to the national average data engineer range of $80,000 to $149,000.

Average data engineer salary in Walnut Creek, CA

$131,000

What are the biggest employers of Data Engineers in Walnut Creek, CA?

The biggest employers of Data Engineers in Walnut Creek, CA are:
  1. Pacific Service Credit Union