Staff Data Scientist - Post Sales
Data engineer job in Fremont, CA
Salary: $200-250k base + RSUs
This fast-growing Series E AI SaaS company is redefining how modern engineering teams build and deploy applications. We're expanding our data science organization to accelerate customer success after the initial sale, driving onboarding, retention, expansion, and long-term revenue growth.
About the Role
As the senior data scientist supporting post-sales teams, you will use advanced analytics, experimentation, and predictive modeling to guide strategy across Customer Success, Account Management, and Renewals. Your insights will help leadership forecast expansion, reduce churn, and identify the levers that unlock sustainable net revenue retention.
Key Responsibilities
Forecast & Model Growth: Build predictive models for renewal likelihood, expansion potential, churn risk, and customer health scoring.
Optimize the Customer Journey: Analyze onboarding flows, product adoption patterns, and usage signals to improve activation, engagement, and time-to-value.
Experimentation & Causal Analysis: Design and evaluate experiments (A/B tests, uplift modeling) to measure the impact of onboarding programs, success initiatives, and pricing changes on retention and expansion.
Revenue Insights: Partner with Customer Success and Sales to identify high-value accounts, cross-sell opportunities, and early warning signs of churn.
Cross-Functional Partnership: Collaborate with Product, RevOps, Finance, and Marketing to align post-sales strategies with company growth goals.
Data Infrastructure Collaboration: Work with Analytics Engineering to define data requirements, maintain data quality, and enable self-serve dashboards for Success and Finance teams.
Executive Storytelling: Present clear, actionable recommendations to senior leadership that translate complex analysis into strategic decisions.
About You
Experience: 6+ years in data science or advanced analytics, with a focus on post-sales, customer success, or retention analytics in a B2B SaaS environment.
Technical Skills: Expert SQL and proficiency in Python or R for statistical modeling, forecasting, and machine learning.
Domain Knowledge: Deep understanding of SaaS metrics such as net revenue retention (NRR), gross churn, expansion ARR, and customer health scoring.
Analytical Rigor: Strong background in experimentation design, causal inference, and predictive modeling to inform customer-lifecycle strategy.
Communication: Exceptional ability to translate data into compelling narratives for executives and cross-functional stakeholders.
Business Impact: Demonstrated success improving onboarding efficiency, retention rates, or expansion revenue through data-driven initiatives.
Staff Data Scientist
Data engineer job in Fremont, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge, apply today and we'll be in touch.
Data Scientist
Data engineer job in San Francisco, CA
We're working with a Series A health tech start-up pioneering a revolutionary approach to healthcare AI, developing neurosymbolic systems that combine statistical learning with structured medical knowledge. Their technology is being adopted by leading health systems and insurers to enhance patient outcomes through advanced predictive analytics.
We're seeking Machine Learning Engineers who excel at the intersection of data science, modeling, and software engineering. You'll design and implement models that extract insights from longitudinal healthcare data, balancing analytical rigor, interpretability, and scalability.
This role offers a unique opportunity to tackle foundational modeling challenges in healthcare, where your contributions will directly influence clinical, actuarial, and policy decisions.
Key Responsibilities
Develop predictive models to forecast disease progression, healthcare utilization, and costs using temporal clinical data (claims, EHR, laboratory results, pharmacy records)
Design interpretable and explainable ML solutions that earn the trust of clinicians, actuaries, and healthcare decision-makers
Research and prototype innovative approaches leveraging both classical and modern machine learning techniques
Build robust, scalable ML pipelines for training, validation, and deployment in distributed computing environments
Collaborate cross-functionally with data engineers, clinicians, and product teams to ensure models address real-world healthcare needs
Communicate findings and methodologies effectively through visualizations, documentation, and technical presentations
Required Qualifications
Strong foundation in statistical modeling, machine learning, or data science, with preference for experience in temporal or longitudinal data analysis
Proficiency in Python and ML frameworks (PyTorch, JAX, NumPyro, PyMC, etc.)
Proven track record of transitioning models from research prototypes to production systems
Experience with probabilistic methods, survival analysis, or Bayesian inference (highly valued)
Bonus Qualifications
Experience working with clinical data and healthcare terminologies (ICD, CPT, SNOMED CT, LOINC)
Background in actuarial modeling, claims forecasting, or risk adjustment methodologies
Lead Data Scientist - Computer Vision
Data engineer job in Santa Clara, CA
Lead Data Scientist - Computer Vision/Image Processing
About the Role
We are seeking a Lead Data Scientist to drive the strategy and execution of data science initiatives, with a particular focus on computer vision systems and image processing. The ideal candidate has deep expertise in image processing techniques, including filtering, binary morphology, perspective/affine transformation, and edge detection.
Responsibilities
Solid knowledge of computer vision programs and image processing techniques: filtering, binary morphology, perspective/affine transformation, and edge detection
Strong understanding of machine learning: Regression, Supervised and Unsupervised Learning
Proficiency in Python and libraries such as OpenCV, NumPy, scikit-learn, TensorFlow/PyTorch.
Familiarity with version control (Git) and collaborative development practices
Data Scientist V
Data engineer job in Mountain View, CA
Job Title: Data Scientist V - Data Analytics & Engineering
Location: Onsite preferred (Mountain View, CA); Remote considered for strong candidates (US time zones only)
Duration: 12 months (possible extension)
Required Skills:
Strong project or product management experience
Excellent communication and consulting skills
Proficiency in SQL and Python
Nice to Have:
Experience with marketing analytics or campaigns
Experience in large tech or fast-paced startup environments
Familiarity with AI-driven workflows
Why Join:
High-visibility, cross-functional role
Opportunity to work on advanced measurement and automation tools
Small, agile team with enterprise-scale impact
Data Scientist
Data engineer job in Pleasanton, CA
Key Responsibilities
Design and develop marketing-focused machine learning models, including:
Customer segmentation
Propensity, churn, and lifetime value (LTV) models
Campaign response and uplift models
Attribution and marketing mix models (MMM)
Build and deploy NLP solutions for:
Customer sentiment analysis
Text classification and topic modeling
Social media, reviews, chat, and voice-of-customer analytics
Apply advanced statistical and ML techniques to solve real-world business problems.
Work with structured and unstructured data from multiple marketing channels (digital, CRM, social, email, web).
Translate business objectives into analytical frameworks and actionable insights.
Partner with stakeholders to define KPIs, success metrics, and experimentation strategies (A/B testing).
Optimize and productionize models using MLOps best practices.
Mentor junior data scientists and provide technical leadership.
Communicate complex findings clearly to technical and non-technical audiences.
Required Skills & Qualifications
7+ years of experience in Data Science, with a strong focus on marketing analytics.
Strong expertise in Machine Learning (supervised & unsupervised techniques).
Hands-on experience with NLP techniques, including:
Text preprocessing and feature extraction
Word embeddings (Word2Vec, GloVe, Transformers)
Experience with Large Language Models (LLMs) is a plus
Proficiency in Python (NumPy, Pandas, Scikit-learn, TensorFlow/PyTorch).
Experience with SQL and large-scale data processing.
Strong understanding of statistics, probability, and experimental design.
Experience working with cloud platforms (AWS, Azure, or GCP).
Ability to translate data insights into business impact.
Nice to Have
Experience with marketing automation or CRM platforms.
Knowledge of MLOps, model monitoring, and deployment pipelines.
Familiarity with GenAI/LLM-based NLP use cases for marketing.
Prior experience in consumer, e-commerce, or digital marketing domains.
EEO
Centraprise is an equal opportunity employer. Your application and candidacy will not be considered based on race, color, sex, religion, creed, sexual orientation, gender identity, national origin, disability, genetic information, pregnancy, veteran status or any other characteristic protected by federal, state or local laws.
Data Scientist with Gen AI and Python experience
Data engineer job in Palo Alto, CA
About the Company:
Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction.
Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.
Here are the job details:
Data Scientist with Gen AI and Python experience
Palo Alto, CA - 5 days onsite
Interview Mode: Phone & F2F
Job Overview:
A competent Data Scientist who is independent and results-driven, and who can take business requirements and build out the technologies to generate statistically sound analysis and production-grade ML models.
DS skills with GenAI and LLM knowledge.
Expertise in Python/Spark and their related libraries and frameworks.
Experience building ML training pipelines and with the effort involved in ML model deployment.
Experience in other ML concepts - Real time distributed model inferencing pipeline, Champion/Challenger framework, A/B Testing, Model.
Familiar with DS/ML Production implementation.
Excellent problem-solving skills, with attention to detail, focus on quality and timely delivery of assigned tasks.
Prior knowledge of Azure cloud and Databricks will be a big plus.
Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.
Snowflake Data Engineer
Data engineer job in Santa Clara, CA
Title: Snowflake Data Architect - Onsite all 5 days
Note: The candidate is required to attend the final interview in person at Santa Clara, CA 95054
Key Requirements:
Strong hands-on experience in data engineering using Snowflake with proven ability to build and optimize large-scale data pipelines.
Deep understanding of data architecture principles, including ingestion, transformation, storage, and access control.
Solid experience in system design and solution architecture, focusing on scalability, reliability, and maintainability.
Expertise in ETL/ELT pipeline design, including data extraction, transformation, validation, and load processes.
In-depth knowledge of data modeling techniques (dimensional modeling, star, and snowflake schemas).
Skilled in optimizing compute and storage costs across Snowflake environments.
Strong proficiency in administration, including database design, schema management, user roles, permissions, and access control policies.
Hands-on experience implementing data lineage, quality, and monitoring frameworks.
Advanced proficiency in SQL for data processing, transformation, and automation.
Experience with reporting and visualization tools such as Power BI and Sigma Computing.
Excellent communication and collaboration skills, with the ability to work independently and drive technical initiatives.
Senior Data Engineer
Data engineer job in Fremont, CA
The Company:
A data services company based in the heart of San Francisco is looking for a Senior Data Engineer. They are a team of passionate engineers and data experts working on a variety of projects, primarily in the financial services sector, helping organizations build scalable, modern data platforms. This is a hands-on, full-time role with close collaboration alongside the CTO and senior engineers, offering strong influence over technical direction and delivery.
The Role:
This is an on-site position in downtown San Francisco, where you will work as part of a close-knit team, collaborating on projects in their brand-new office. You will work across end-to-end data projects, including:
Building and maintaining data pipelines and ETL processes.
Sourcing and integrating third-party APIs and datasets.
Batch and near-real-time processing (cloud agnostic).
Downstream analytics and reporting using tools like Sigma Computing and Omnium Analytics.
Collaborating with the CTO and engineering team to deliver client solutions.
Key Skills:
5+ years' data engineering experience
Strong Python, BigQuery, and cloud (GCP or similar)
Solid ETL and pipeline background
Comfortable with large-scale data
Nice to Have
Beam, Dataflow, Spark, or Hadoop
Tableau or Looker
ML/AI exposure
Kafka or Pub/Sub
Given the varied nature of the work, a broad range of technology experience is valued. You don't need experience with every tool listed to be considered, so we encourage you to apply.
This role is 5 days a week on-site in downtown San Francisco. The salary range is $170,000-$220,000, with a bonus of 15-20%.
Benefits
Health, Dental & Vision covered
Unlimited PTO
401(k) with employer contribution
Commuter benefits
Data Engineer
Data engineer job in San Francisco, CA
Job Title: Data Engineer - Retail / E-Commerce
🏢 Company: Aaratech Inc
🛑 Eligibility: Only U.S. Citizens and Green Card holders are eligible.
Please note that we do not offer visa sponsorship.
Aaratech Inc. is seeking a results-driven Data Engineer - Retail / E-Commerce to support customer, sales, and product data platforms. The role focuses on building scalable pipelines that enable real-time and batch analytics for business growth.
Key Responsibilities:
🔹 Data Pipeline Development
Develop and maintain data pipelines for sales, customer, and product data.
Integrate data from e-commerce platforms and marketing systems.
🔹 Data Modeling
Design data models to support analytics and BI reporting.
Optimize performance and scalability.
🔹 Data Quality
Ensure data accuracy, completeness, and consistency.
Implement monitoring and error-handling processes.
🔹 Collaboration
Work closely with data analysts, product, and marketing teams.
🔹 Tools & Technologies
Use SQL, Python, ETL tools, and cloud data platforms.
Qualifications:
✅ Bachelor's degree in Computer Science, Engineering, or related field
✅ Minimum 2 years of Data Engineering experience
✅ Strong SQL and Python skills
✅ Experience with data pipelines and analytics platforms
✅ Retail or e-commerce data experience preferred
✅ Strong problem-solving skills
Data Engineer, Knowledge Graphs
Data engineer job in San Francisco, CA
We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought.
Mithrl is building the world's first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports.
No coding. No waiting. No bioinformatics bottlenecks.
We are one of the fastest growing tech bio companies in the Bay Area with 12x year over year revenue growth. Our platform is used across three continents by leading biotechs and big pharmas. We power breakthroughs from early target discovery to mechanism-of-action. And we are just getting started.
ABOUT THE ROLE
We are hiring a Data Engineer, Knowledge Graphs to build the infrastructure that powers Mithrl's biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs to take curated knowledge sources and transform them into scalable, reliable, production ready systems that serve the entire platform.
Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently. You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases.
This role is the bridge between biological knowledge ingestion and the high performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have deep impact on the intelligence layer of Mithrl.
WHAT YOU WILL DO
Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
Design, implement, and evolve schemas and storage models for graph structured biological data
Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
Build data models that support multi tenant access, versioning, and reproducibility across releases
Implement scalable storage and indexing strategies for high volume graph data
Maintain data quality, validate data integrity, and build monitoring around ingestion and usage
Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications
Support data warehousing, documentation, and API reliability
Ensure performance, reliability, and uptime for knowledge graph services
WHAT YOU BRING
Required Qualifications
Strong experience as a data engineer or backend engineer working with data intensive systems
Experience building ETL or ELT pipelines for large structured or semi structured datasets
Strong understanding of database design, schema modeling, and data architecture
Experience with graph data models or willingness to learn graph storage concepts
Proficiency in Python or similar languages for data engineering
Experience designing and maintaining APIs for data access
Understanding of versioning, provenance, validation, and reproducibility in data systems
Experience with cloud infrastructure and modern data stack tools
Strong communication skills and ability to work closely with scientific and engineering teams
Nice to Have
Experience with graph databases or graph query languages
Experience with biological or chemical data sources
Familiarity with ontologies, controlled vocabularies, and metadata standards
Experience with data warehousing and analytical storage formats
Previous work in a tech bio company or scientific platform environment
WHAT YOU WILL LOVE AT MITHRL
You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable
Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
Speed: We ship fast (2x/week) and improve continuously based on real user feedback
Location: Beautiful SF office with a high-energy, in-person culture
Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans
Data Engineer
Data engineer job in San Francisco, CA
Midjourney is a research lab exploring new mediums to expand the imaginative powers of the human species. We are a small, self-funded team focused on design, human infrastructure, and AI. We have no investors, no big company controlling us, and no advertisers. We are 100% supported by our amazing community.
Our tools are already used by millions of people to dream, to explore, and to create. But this is just the start. We think the story of the 2020s is about building the tools that will remake the world for the next century. We're making those tools, to expand what it means to be human.
Core Responsibilities:
Design and maintain data pipelines to consolidate information across multiple sources (subscription platforms, payment systems, infrastructure and usage monitoring, and financial systems) into a unified analytics environment
Build and manage interactive dashboards and self-service BI tools that enable leadership to track key business metrics including revenue performance, infrastructure costs, customer retention, and operational efficiency
Serve as technical owner of our financial planning platform (Pigment or similar), leading implementation and build-out of models, data connections, and workflows in partnership with Finance leadership to translate business requirements into functional system architecture
Develop automated data quality checks and cleaning processes to ensure accuracy and consistency across financial and operational datasets
Partner with Finance, Product and Operations teams to translate business questions into analytical frameworks, including cohort analysis, cost modeling, and performance trending
Create and maintain documentation for data models, ETL processes, dashboard logic, and system workflows to ensure knowledge continuity
Support strategic planning initiatives by building financial models, scenario analyses, and data-driven recommendations for resource allocation and growth investments
Required Qualifications:
3-5+ years experience in data engineering, analytics engineering, or similar role with demonstrated ability to work with large-scale datasets
Strong SQL skills and experience with modern data warehousing solutions (BigQuery, Snowflake, Redshift, etc.)
Proficiency in at least one programming language (Python, R) for data manipulation and analysis
Experience with BI/visualization tools (Looker, Tableau, Power BI, or similar)
Hands-on experience administering enterprise financial systems (NetSuite, SAP, Oracle, or similar ERP platforms)
Experience working with Stripe Billing or similar subscription management platforms, including data extraction and revenue reporting
Ability to communicate technical concepts clearly to non-technical stakeholders
Data Engineer / Analytics Specialist
Data engineer job in San Francisco, CA
Citizenship Requirement: U.S. Citizens Only
ITTConnect is seeking a Data Engineer / Analytics Specialist to work for one of our clients, a major technology consulting firm with headquarters in Europe. They are experts in tailored technology consulting and services for banks, investment firms, and other clients in the financial vertical.
Job location: San Francisco Bay Area or New York City.
Work Model: Ability to come into the office as requested
Seniority: 10+ years of total experience
About the role:
The Data Engineer / Analytics Specialist will support analytics, product insights, and AI initiatives. You will build robust data pipelines, integrate data sources, and enhance the organization's analytical foundations.
Responsibilities:
Build and operate Snowflake-based analytics environments.
Develop ETL/ELT pipelines (DBT, Airflow, etc.).
Integrate APIs, external data sources, and streaming inputs.
Perform query optimization, basic data modeling, and analytics support.
Enable downstream GenAI and analytics use cases.
Requirements:
10+ years of overall technology experience
3+ years hands-on AWS experience required
Strong SQL and Snowflake experience.
Hands-on pipeline engineering with DBT, Airflow, or similar.
Experience with API integrations and modern data architectures.
Imaging Data Engineer/Architect
Data engineer job in San Francisco, CA
About us:
Intuitive is an innovation-led engineering company delivering business outcomes for hundreds of enterprises globally. With a reputation as a Tiger Team and a Trusted Partner of enterprise technology leaders, we help solve the most complex Digital Transformation challenges across the following Intuitive Superpowers:
Modernization & Migration
Application & Database Modernization
Platform Engineering (IaC/EaC, DevSecOps & SRE)
Cloud Native Engineering, Migration to Cloud, VMware Exit
FinOps
Data & AI/ML
Data (Cloud Native / Databricks / Snowflake)
Machine Learning, AI/GenAI
Cybersecurity
Infrastructure Security
Application Security
Data Security
AI/Model Security
SDx & Digital Workspace (M365, G-suite)
SDDC, SD-WAN, SDN, NetSec, Wireless/Mobility
Email, Collaboration, Directory Services, Shared Files Services
Intuitive Services:
Professional and Advisory Services
Elastic Engineering Services
Managed Services
Talent Acquisition & Platform Resell Services
About the job:
Title: Imaging Data Engineer/Architect
Start Date: Immediate
# of Positions: 1
Position Type: Contract / Full-Time
Location: San Francisco, CA
Notes:
Imaging Data Engineer/Architect who understands radiology and digital pathology, along with the related clinical data and metadata.
Hands-on experience with the above technologies, and good knowledge of biomedical imaging and data pipelines overall.
About the Role
We are seeking a highly skilled Imaging Data Engineer/Architect to join our San Francisco team as a Subject Matter Expert (SME) in radiology and digital pathology. This role will design and manage imaging data pipelines, ensuring seamless integration of clinical data and metadata to support advanced diagnostic and research applications. The ideal candidate will have deep expertise in medical imaging standards, cloud-based data architectures, and healthcare interoperability, contributing to innovative solutions that enhance patient outcomes.
Responsibilities
Design and implement scalable data architectures for radiology and digital pathology imaging data, including DICOM, HL7, and FHIR standards.
Develop and optimize data pipelines to process and store large-scale imaging datasets (e.g., MRI, CT, histopathology slides) and associated metadata.
Collaborate with clinical teams to understand radiology and pathology workflows, ensuring data solutions align with clinical needs.
Ensure data integrity, security, and compliance with healthcare regulations (e.g., HIPAA, GDPR).
Integrate imaging data with AI/ML models for diagnostic and predictive analytics, working closely with data scientists.
Build and maintain metadata schemas to support data discoverability and interoperability across systems.
Provide technical expertise to cross-functional teams, including product managers and software engineers, to drive imaging data strategy.
Conduct performance tuning and optimization of imaging data storage and retrieval systems in cloud environments (e.g., AWS, Google Cloud, Azure).
Document data architectures and processes, ensuring knowledge transfer to internal teams and external partners.
Stay updated on emerging imaging technologies and standards, proposing innovative solutions to enhance data workflows.
Qualifications
Education: Bachelor's degree in Computer Science, Biomedical Engineering, or a related field (master's preferred).
Experience:
5+ years in data engineering or architecture, with at least 3 years focused on medical imaging (radiology and/or digital pathology).
Proven experience with DICOM, HL7, FHIR, and imaging metadata standards (e.g., SNOMED, LOINC).
Hands-on experience with cloud platforms (AWS, Google Cloud, or Azure) for imaging data storage and processing.
Technical Skills:
Proficiency in programming languages (e.g., Python, Java, SQL) for data pipeline development.
Expertise in ETL processes, data warehousing, and database management (e.g., Snowflake, BigQuery, PostgreSQL).
Familiarity with AI/ML integration for imaging data analytics.
Knowledge of containerization (e.g., Docker, Kubernetes) for deploying data solutions.
Domain Knowledge:
Deep understanding of radiology and digital pathology workflows, including PACS and LIS systems.
Familiarity with clinical data integration and healthcare interoperability standards.
Soft Skills:
Strong analytical and problem-solving skills to address complex data challenges.
Excellent communication skills to collaborate with clinical and technical stakeholders.
Ability to work independently in a fast-paced environment, with a proactive approach to innovation.
Certifications (preferred):
AWS Certified Solutions Architect, Google Cloud Professional Data Engineer, or equivalent.
Certifications in medical imaging (e.g., CIIP - Certified Imaging Informatics Professional).
Senior Data Engineer - Spark, Airflow
Data engineer job in San Francisco, CA
We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives.
In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high-performance data processing systems and ensure data quality, reliability, and lineage across Mastercard's data ecosystem.
The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.
Responsibilities:
Design and optimize Spark-based ETL pipelines for large-scale data processing.
Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing.
Implement partitioning and shuffling strategies to improve Spark performance.
Ensure data lineage, quality, and traceability across systems.
Develop Python scripts for data transformation, aggregation, and validation.
Execute and tune Spark jobs using spark-submit.
Perform DataFrame joins and aggregations for analytical insights.
Automate multi-step processes through shell scripting and variable management.
Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.
Qualifications:
Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience).
At least 7 years of experience in data engineering or big data development.
Strong expertise in Apache Spark architecture, optimization, and job configuration.
Proven experience with Airflow DAGs, including authoring, scheduling, checkpointing, and monitoring.
Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
Expertise in Python programming including data structures and algorithmic problem-solving.
Hands-on experience with Spark DataFrames and PySpark transformations, including joins, aggregations, and filters.
Proficient in shell scripting, including managing and passing variables between scripts.
Experienced with spark-submit for deployment and tuning.
Solid understanding of ETL design, workflow automation, and distributed data systems.
Excellent debugging and problem-solving skills in large-scale environments.
Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
Knowledge of data lineage and data quality frameworks like Apache Atlas.
Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
Senior Snowflake Data Engineer
Data engineer job in Santa Clara, CA
About the job
Why Zensar?
We're a bunch of hardworking, fun-loving, people-oriented technology enthusiasts. We love what we do, and we're passionate about helping our clients thrive in an increasingly complex digital world. Zensar is an organization focused on building relationships, with our clients and with each other, and happiness is at the core of everything we do. In fact, we're so into happiness that we've created a Global Happiness Council, and we send out a Happiness Survey to our employees each year. We've learned that employee happiness requires more than a competitive paycheck, and our employee value proposition of grow, own, achieve, learn (GOAL) lays out the core opportunities we seek to foster for every employee. Teamwork and collaboration are critical to Zensar's mission and success, and our teams work on a diverse and challenging mix of technologies across a broad industry spectrum. These industries include banking and financial services, high-tech and manufacturing, healthcare, insurance, retail, and consumer services. Our employees enjoy flexible work arrangements and a competitive benefits package, including medical, dental, vision, and 401(k), among other benefits. If you are looking for a place to have an immediate impact, to grow and contribute, where we work hard, play hard, and support each other, consider joining team Zensar!
Zensar is seeking a Senior Snowflake Data Engineer in Santa Clara, CA. The role requires working from the office all 5 days. It is open as a full-time position with excellent benefits and growth opportunities, and as a contract role as well.
Job Description:
Key Requirements:
Strong hands-on experience in data engineering using Snowflake with proven ability to build and optimize large-scale data pipelines.
Deep understanding of data architecture principles, including ingestion, transformation, storage, and access control.
Solid experience in system design and solution architecture, focusing on scalability, reliability, and maintainability.
Expertise in ETL/ELT pipeline design, including data extraction, transformation, validation, and load processes.
In-depth knowledge of data modeling techniques (dimensional modeling, star, and snowflake schemas).
Skilled in optimizing compute and storage costs across Snowflake environments.
Strong proficiency in Snowflake administration, including database design, schema management, user roles, permissions, and access control policies.
Hands-on experience implementing data lineage, quality, and monitoring frameworks.
Advanced proficiency in SQL for data processing, transformation, and automation.
Experience with reporting and visualization tools such as Power BI and Sigma Computing.
Excellent communication and collaboration skills, with the ability to work independently and drive technical initiatives.
Zensar believes that diversity of backgrounds, thought, experience, and expertise fosters the robust exchange of ideas that enables the highest quality collaboration and work product. Zensar is an equal opportunity employer. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law. Zensar is committed to providing veteran employment opportunities to our service men and women. Zensar is committed to providing equal employment opportunities for persons with disabilities or religious observances, including reasonable accommodation when needed. Accommodations made to facilitate the recruiting process are not a guarantee of future or continued accommodations once hired.
Zensar does not facilitate/sponsor any work authorization for this position.
Candidates who are currently employed by a client or vendor of Zensar may be ineligible for consideration.
Zensar values your privacy. We'll use your data in accordance with our privacy statement located at: *********************************
Staff Data Scientist - Sales Analytics
Data engineer job in San Francisco, CA
Salary: $200-250k base + RSUs
This fast-growing Series E AI SaaS company is redefining how modern engineering teams build and deploy applications. We're looking for a Staff Data Scientist to drive Sales and Go-to-Market (GTM) analytics, applying advanced modeling and experimentation to accelerate revenue growth and optimize the full sales funnel.
About the Role
As the senior data scientist supporting Sales and GTM, you will combine statistical modeling, experimentation, and advanced analytics to inform strategy and guide decision-making across our revenue organization. Your work will help leadership understand pipeline health, predict outcomes, and identify the levers that unlock sustainable growth.
Key Responsibilities
Model the Business: Build forecasting and propensity models for pipeline generation, conversion rates, and revenue projections.
Optimize the Sales Funnel: Analyze lead scoring, opportunity progression, and deal velocity to recommend improvements in acquisition, qualification, and close rates.
Experimentation & Causal Analysis: Design and evaluate experiments (A/B tests, uplift modeling) to measure the impact of pricing, incentives, and campaign initiatives.
Advanced Analytics for GTM: Apply machine learning and statistical techniques to segment accounts, predict churn/expansion, and identify high-value prospects.
Cross-Functional Partnership: Work closely with Sales, Marketing, RevOps, and Product to influence GTM strategy and ensure data-driven decisions.
Data Infrastructure Collaboration: Partner with Analytics Engineering to define data requirements, ensure data quality, and enable self-serve reporting.
Strategic Insights: Present findings to executive leadership, translating complex analyses into actionable recommendations.
About You
Experience: 6+ years in data science or advanced analytics roles, with significant time spent in B2B SaaS or developer tools environments.
Technical Depth: Expert in SQL and proficient in Python or R for statistical modeling, forecasting, and machine learning.
Domain Knowledge: Strong understanding of sales analytics, revenue operations, and product-led growth (PLG) motions.
Analytical Rigor: Skilled in experimentation design, causal inference, and building predictive models that influence GTM strategy.
Communication: Exceptional ability to tell a clear story with data and influence senior stakeholders across technical and business teams.
Business Impact: Proven record of driving measurable improvements in pipeline efficiency, conversion rates, or revenue outcomes.
Staff Data Scientist
Data engineer job in San Francisco, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge, apply today and we'll be in touch.
Senior Data Engineer
Data engineer job in San Francisco, CA
The Company:
A data services company based in the heart of San Francisco is looking for a Senior Data Engineer. They are a team of passionate engineers and data experts working on a variety of projects, primarily in the financial services sector, helping organizations build scalable, modern data platforms. This is a hands-on, full-time role with close collaboration alongside the CTO and senior engineers, offering strong influence over technical direction and delivery.
The Role:
This is an on-site position in downtown San Francisco where you will be working as part of a close-knit team, collaborating on projects in their brand new office. You will be working across end-to-end data projects, including:
Building and maintaining data pipelines and ETL processes.
Sourcing and integrating third-party APIs and datasets.
Batch and near-real-time processing (cloud agnostic).
Downstream analytics and reporting using tools like Sigma Computing and Omnium Analytics.
Collaborating with the CTO and engineering team to deliver client solutions.
Key Skills:
5+ years' data engineering experience
Strong Python, BigQuery, and cloud (GCP or similar)
Solid ETL and pipeline background
Comfortable with large-scale data
Nice to Have
Beam, Dataflow, Spark, or Hadoop
Tableau or Looker
ML/AI exposure
Kafka or Pub/Sub
Given the varied nature of the work, a broad range of technology experience is valued. You don't need to have experience with every tool listed above to be considered, so we encourage you to apply.
This role is 5 days a week on-site in downtown San Francisco. The salary range is $170,000-$220,000, with a bonus of 15-20%.
Benefits
Health, Dental & Vision covered
Unlimited PTO
401(k) with employer contribution
Commuter benefits
Data Engineer - Scientific Data Ingestion
Data engineer job in San Francisco, CA
We envision a world where novel drugs and therapies reach patients in months, not years, accelerating breakthroughs that save lives.
Mithrl is building the world's first commercially available AI Co-Scientist, a discovery engine that empowers life science teams to go from messy biological data to novel insights in minutes. Scientists ask questions in natural language, and Mithrl answers with real analysis, novel targets, and patent-ready reports. No coding. No waiting. No bioinformatics bottlenecks.
We are the fastest-growing tech-bio startup in the Bay Area with over 12X YoY revenue growth. Our platform is already being used by teams at some of the largest biotechs and big pharma across three continents to accelerate and uncover breakthroughs, from target discovery to mechanism of action.
WHAT YOU WILL DO
Build and own an AI-powered ingestion & normalization pipeline to import data from a wide variety of sources - unprocessed Excel/CSV uploads, lab and instrument exports, as well as processed data from internal pipelines.
Develop robust schema mapping, coercion, and conversion logic (think: units normalization, metadata standardization, variable-name harmonization, vendor-instrument quirks, plate-reader formats, reference-genome or annotation updates, batch-effect correction, etc.).
Use LLM-driven and classical data-engineering tools to structure “semi-structured” or messy tabular data - extracting metadata, inferring column roles/types, cleaning free-text headers, fixing inconsistencies, and preparing final clean datasets.
Ensure all transformations that should only happen once (normalization, coercion, batch-correction) execute during ingestion - so downstream analytics / the AI “Co-Scientist” always works with clean, canonical data.
Build validation, verification, and quality-control layers to catch ambiguous, inconsistent, or corrupt data before it enters the platform.
Collaborate with product teams, data science / bioinformatics colleagues, and infrastructure engineers to define and enforce data standards, and ensure pipeline outputs integrate cleanly into downstream analysis and storage systems.
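To make the schema-mapping and units-normalization responsibilities above more concrete, here is a minimal Python sketch of what ingestion-time harmonization might look like. It is an illustration under stated assumptions, not Mithrl's actual pipeline: the function names, the conversion table, and the choice of micromolar as the canonical unit are all hypothetical.

```python
import re

# Hypothetical conversion factors to a canonical unit (micromolar).
# Keys are case-sensitive on purpose: "mM" and "MM" are different units.
TO_MICROMOLAR = {"nM": 0.001, "uM": 1.0, "mM": 1000.0}


def harmonize_header(raw: str) -> str:
    """Lower-case a free-text header and collapse runs of
    non-alphanumeric characters into single underscores."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", raw.strip().lower())
    return cleaned.strip("_")


def to_micromolar(value: float, unit: str) -> float:
    """Convert a concentration to micromolar, rejecting unknown units
    instead of guessing (ambiguous data should fail validation)."""
    # Treat the micro sign and the Greek letter mu as a plain "u".
    key = unit.strip().replace("\u00b5", "u").replace("\u03bc", "u")
    if key not in TO_MICROMOLAR:
        raise ValueError(f"unknown concentration unit: {unit!r}")
    return value * TO_MICROMOLAR[key]


def harmonize_row(row: dict) -> dict:
    """Return a copy of the row with harmonized column names."""
    return {harmonize_header(name): value for name, value in row.items()}
```

For example, `harmonize_row({"  Sample ID ": "A1", "Conc. (nM)": 500})` yields keys `sample_id` and `conc_nm`. Raising on unknown units, rather than passing values through, reflects the posting's emphasis on catching ambiguous data before it enters the platform.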
WHAT YOU BRING
Must-have
5+ years of experience in data engineering / data wrangling with real-world tabular or semi-structured data.
Strong fluency in Python and data processing tools (Pandas, Polars, PyArrow, or similar).
Extensive experience dealing with messy Excel / CSV / spreadsheet-style data - inconsistent headers, multiple sheets, mixed formats, free-text fields - and normalizing it into clean structures.
Comfort designing and maintaining robust ETL/ELT pipelines, ideally for scientific or lab-derived data.
Ability to combine classical data engineering with LLM-powered data normalization / metadata extraction / cleaning.
Strong desire and ability to own the ingestion & normalization layer end-to-end - from raw upload → final clean dataset - with an eye for maintainability, reproducibility, and scalability.
Good communication skills; able to collaborate across teams (product, bioinformatics, infra) and translate real-world messy data problems into robust engineering solutions.
Nice-to-have
Familiarity with scientific data types and “modalities” (e.g. plate-readers, genomics metadata, time-series, batch-info, instrumentation outputs).
Experience with workflow orchestration tools (e.g. Nextflow, Prefect, Airflow, Dagster), or building pipeline abstractions.
Experience with cloud infrastructure and data storage (AWS S3, data lakes/warehouses, database schemas) to support multi-tenant ingestion.
Past exposure to LLM-based data transformation or cleansing agents - building or integrating tools that clean or structure messy data automatically.
Any background in computational biology / lab-data / bioinformatics is a bonus - though not required.
WHAT YOU WILL LOVE AT MITHRL
Mission-driven impact: you'll be the gatekeeper of data quality - ensuring that all scientific data entering Mithrl becomes clean, consistent, and analysis-ready. You'll have outsized influence over the reliability and trustworthiness of our entire data + AI stack.
High ownership & autonomy: this role is yours to shape. You decide how ingestion works, define the standards, build the pipelines. You'll work closely with our product, data science, and infrastructure teams - shaping how data is ingested, stored, and exposed to end users or AI agents.
Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
Speed: We ship fast (2x/week) and improve continuously based on real user feedback
Location: Beautiful SF office with a high-energy, in-person culture
Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans