Staff Data Scientist
Data scientist job in San Francisco, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
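The experimentation skills listed above can be made concrete with a small sketch: a two-proportion z-test of the kind used to check whether an A/B test's conversion lift is statistically significant. The function name and all counts below are hypothetical, for illustration only.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))        # two-sided normal tail
    return z, p_value

# Hypothetical experiment: 10,000 users per arm.
z, p = two_proportion_ztest(conv_a=1000, n_a=10_000, conv_b=1100, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A 10% vs 11% conversion split at this sample size lands just past the conventional 0.05 threshold, which is the kind of judgment call this role would make daily.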
If you're excited by the sound of this challenge, apply today and we'll be in touch.
Data Scientist
Data scientist job in Sonoma, CA
Key Responsibilities
Design and productionize models for opportunity scanning, anomaly detection, and significant change detection across CRM, streaming, ecommerce, and social data.
Define and tune alerting logic (thresholds, SLOs, precision/recall) to minimize noise while surfacing high-value marketing actions.
Partner with marketing, product, and data engineering to operationalize insights into campaigns, playbooks, and automated workflows, with clear monitoring and experimentation.
Required Qualifications
Strong proficiency in Python (pandas, NumPy, scikit-learn; plus experience with PySpark or similar for large-scale data) and SQL on modern warehouses (e.g., BigQuery, Snowflake, Redshift).
Hands-on experience with time-series modeling and anomaly / changepoint / significant-movement detection (e.g., STL decomposition, EWMA/CUSUM, Bayesian/Prophet-style models, isolation forests, robust statistics).
Experience building and deploying production ML pipelines (batch and/or streaming), including feature engineering, model training, CI/CD, and monitoring for performance and data drift.
Solid background in statistics and experimentation: hypothesis testing, power analysis, A/B testing frameworks, uplift/propensity modeling, and basic causal inference techniques.
Familiarity with cloud platforms (GCP/AWS/Azure), orchestration tools (e.g., Airflow/Prefect), and dashboarding/visualization tools to expose alerts and model outputs to business users.
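One of the detection methods named above, an EWMA control chart, can be sketched in a few lines. This is a minimal illustration, not a production alerting system; the function name, smoothing factor, threshold, and sample series are all hypothetical.

```python
def ewma_anomalies(series, alpha=0.3, k=3.0):
    """Flag points whose residual exceeds k exponentially weighted std devs.

    A minimal EWMA control chart: the smoothed mean and an exponentially
    weighted variance are updated online, and indices of out-of-band
    points are returned. alpha and k are tuning knobs (hypothetical defaults).
    """
    mean, var = series[0], 0.0
    flagged = []
    for i, x in enumerate(series[1:], start=1):
        resid = x - mean
        sigma = var ** 0.5
        if sigma > 0 and abs(resid) > k * sigma:
            flagged.append(i)
        # online EWMA updates for mean and variance
        mean += alpha * resid
        var = (1 - alpha) * (var + alpha * resid * resid)
    return flagged

daily_signups = [100, 102, 98, 101, 99, 103, 180, 100, 97]  # hypothetical metric
print(ewma_anomalies(daily_signups))
```

Tuning `alpha` and `k` against labeled incidents is exactly the precision/recall trade-off this posting describes for minimizing alert noise.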
Data Scientist
Data scientist job in San Francisco, CA
We're working with a Series A health tech start-up pioneering a revolutionary approach to healthcare AI, developing neurosymbolic systems that combine statistical learning with structured medical knowledge. Their technology is being adopted by leading health systems and insurers to enhance patient outcomes through advanced predictive analytics.
We're seeking Machine Learning Engineers who excel at the intersection of data science, modeling, and software engineering. You'll design and implement models that extract insights from longitudinal healthcare data, balancing analytical rigor, interpretability, and scalability.
This role offers a unique opportunity to tackle foundational modeling challenges in healthcare, where your contributions will directly influence clinical, actuarial, and policy decisions.
Key Responsibilities
Develop predictive models to forecast disease progression, healthcare utilization, and costs using temporal clinical data (claims, EHR, laboratory results, pharmacy records)
Design interpretable and explainable ML solutions that earn the trust of clinicians, actuaries, and healthcare decision-makers
Research and prototype innovative approaches leveraging both classical and modern machine learning techniques
Build robust, scalable ML pipelines for training, validation, and deployment in distributed computing environments
Collaborate cross-functionally with data engineers, clinicians, and product teams to ensure models address real-world healthcare needs
Communicate findings and methodologies effectively through visualizations, documentation, and technical presentations
Required Qualifications
Strong foundation in statistical modeling, machine learning, or data science, with preference for experience in temporal or longitudinal data analysis
Proficiency in Python and ML frameworks (PyTorch, JAX, NumPyro, PyMC, etc.)
Proven track record of transitioning models from research prototypes to production systems
Experience with probabilistic methods, survival analysis, or Bayesian inference (highly valued)
Bonus Qualifications
Experience working with clinical data and healthcare terminologies (ICD, CPT, SNOMED CT, LOINC)
Background in actuarial modeling, claims forecasting, or risk adjustment methodologies
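Survival analysis, one of the highly valued methods above, reduces to a short computation in its simplest form. Here is a minimal Kaplan-Meier estimator over hypothetical follow-up data; real work on claims or EHR data would use a vetted library, so treat this as a sketch of the idea only.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier survival curve from follow-up data.

    times:  follow-up duration for each subject (e.g., months)
    events: 1 if the event (e.g., disease progression) occurred, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    n_at_risk = len(times)
    curve, s = [], 1.0
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d:
            s *= (n_at_risk - d) / n_at_risk   # multiply conditional survival
            curve.append((t, s))
        n_at_risk -= sum(1 for u in times if u == t)  # subjects leave risk set
    return curve

# Hypothetical cohort: months of follow-up, 0 = censored.
times  = [2, 3, 3, 5, 7, 8, 8, 10]
events = [1, 1, 0, 1, 0, 1, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.3f}")
```

The step-down structure of the resulting curve is what makes the method interpretable to clinicians and actuaries, which this role emphasizes.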
Data Scientist
Data scientist job in Pleasanton, CA
Key Responsibilities
Design and develop marketing-focused machine learning models, including:
Customer segmentation
Propensity, churn, and lifetime value (LTV) models
Campaign response and uplift models
Attribution and marketing mix models (MMM)
Build and deploy NLP solutions for:
Customer sentiment analysis
Text classification and topic modeling
Social media, reviews, chat, and voice-of-customer analytics
Apply advanced statistical and ML techniques to solve real-world business problems.
Work with structured and unstructured data from multiple marketing channels (digital, CRM, social, email, web).
Translate business objectives into analytical frameworks and actionable insights.
Partner with stakeholders to define KPIs, success metrics, and experimentation strategies (A/B testing).
Optimize and productionize models using MLOps best practices.
Mentor junior data scientists and provide technical leadership.
Communicate complex findings clearly to technical and non-technical audiences.
Required Skills & Qualifications
7+ years of experience in Data Science, with a strong focus on marketing analytics.
Strong expertise in Machine Learning (supervised & unsupervised techniques).
Hands-on experience with NLP techniques, including:
Text preprocessing and feature extraction
Word embeddings (Word2Vec, GloVe, Transformers)
Large Language Models (LLMs) is a plus
Proficiency in Python (NumPy, Pandas, Scikit-learn, TensorFlow/PyTorch).
Experience with SQL and large-scale data processing.
Strong understanding of statistics, probability, and experimental design.
Experience working with cloud platforms (AWS, Azure, or GCP).
Ability to translate data insights into business impact.
Nice to Have
Experience with marketing automation or CRM platforms.
Knowledge of MLOps, model monitoring, and deployment pipelines.
Familiarity with GenAI/LLM-based NLP use cases for marketing.
Prior experience in consumer, e-commerce, or digital marketing domains.
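The uplift modeling mentioned in the responsibilities starts from a simple quantity: the difference in response rates between treated and control customers within a segment. A minimal sketch, with a hypothetical function name and made-up campaign log:

```python
from collections import defaultdict

def uplift_by_segment(rows):
    """Per-segment campaign uplift: P(respond | treated) - P(respond | control).

    rows: iterable of (segment, treated: bool, responded: bool) tuples.
    Returns {segment: uplift}.
    """
    counts = defaultdict(lambda: [0, 0, 0, 0])  # [treat_n, treat_resp, ctrl_n, ctrl_resp]
    for seg, treated, responded in rows:
        c = counts[seg]
        if treated:
            c[0] += 1
            c[1] += responded
        else:
            c[2] += 1
            c[3] += responded
    return {seg: c[1] / c[0] - c[3] / c[2] for seg, c in counts.items()}

# Hypothetical campaign log: (segment, got_campaign, converted)
rows = (
    [("new", True, True)] * 30 + [("new", True, False)] * 70 +
    [("new", False, True)] * 20 + [("new", False, False)] * 80 +
    [("loyal", True, True)] * 50 + [("loyal", True, False)] * 50 +
    [("loyal", False, True)] * 48 + [("loyal", False, False)] * 52
)
print(uplift_by_segment(rows))
```

In this toy log the "new" segment shows a 10-point lift while "loyal" shows almost none, which is the kind of finding that redirects campaign spend.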
EEO
Centraprise is an equal opportunity employer. Your application and candidacy will not be considered based on race, color, sex, religion, creed, sexual orientation, gender identity, national origin, disability, genetic information, pregnancy, veteran status or any other characteristic protected by federal, state or local laws.
Founding Data Scientist (GTM)
Data scientist job in San Francisco, CA
An early-stage investment of ours is looking to make their first IC hire in data science. This company builds tools that help teams understand how their AI systems perform and improve them over time (and they already have a lot of enterprise customers).
We're looking for a Sr Data Scientist to lead analytics for sales, marketing, and customer success. The job is about finding insights in data, running analyses and experiments, and helping the business make better decisions.
Responsibilities:
Analyze data to improve how the company finds, converts, and supports customers
Create models that predict lead quality, conversion, and customer value
Build clear dashboards and reports for leadership
Work with teams across the company to answer key questions
Take initiative, communicate clearly, and dig into data to solve problems
Try new methods and tools to keep improving the company's GTM approach
Qualifications:
5+ years related industry experience working with data and supporting business teams.
Solid experience analyzing GTM or revenue-related data
Strong skills in SQL and modern analytics tools (Snowflake, Hex, dbt, etc.)
Comfortable owning data workflows, from cleaning and modeling to presenting insights
Able to work independently, prioritize well, and move projects forward without much direction
Clear thinker and communicator who can turn data into actionable recommendations
Adaptable and willing to learn new methods in a fast-paced environment
About Us:
Greylock is an early-stage investor in hundreds of remarkable companies including Airbnb, LinkedIn, Dropbox, Workday, Cloudera, Facebook, Instagram, Roblox, Coinbase, and Palo Alto Networks, among others. More about us can be found here: *********************
How We Work:
We are full-time, salaried employees of Greylock and provide free candidate referrals/introductions to our active investments. If you look like a potential match, we will contact you right away to schedule a call.
Due to the selective nature of this service and the volume of applicants we typically receive from our job postings, a follow-up email will not be sent until a match is identified with one of our investments.
Please note: We are not recruiting for any roles within Greylock at this time. This job posting is for direct employment with a startup in our portfolio.
Senior Data Scientist
Data scientist job in Pleasanton, CA
Net2Source is a Global Workforce Solutions Company headquartered in NJ, USA, with branch offices in the Asia-Pacific region. We are one of the fastest-growing IT consulting companies in the USA, and we are hiring a Senior Data Scientist for one of our clients. We offer a wide gamut of consulting solutions customized to our 450+ clients, ranging from Fortune 500/1000 companies to start-ups across verticals such as Technology, Financial Services, Healthcare, Life Sciences, Oil & Gas, Energy, Retail, Telecom, Utilities, Manufacturing, the Internet, and Engineering.
Position: Senior Data Scientist
Location: Pleasanton, CA (Onsite) - Locals Only
Type: Contract
Exp Level - 10+ Years
Required Skills
Design, develop, and deploy advanced marketing models.
Build and productionize NLP solutions.
Partner with Marketing and Business stakeholders to translate business objectives into data science solutions.
Work with large-scale structured and unstructured datasets using SQL, Python, and distributed systems.
Evaluate and implement state-of-the-art ML/NLP techniques to improve model performance and business impact.
Communicate insights, results, and recommendations clearly to both technical and non-technical audiences.
Required Qualifications
5+ years of experience in data science or applied machine learning, with a strong focus on marketing analytics.
Hands-on experience building predictive marketing models (e.g., segmentation, attribution, personalization).
Strong expertise in NLP techniques and libraries (e.g., spaCy, NLTK, Hugging Face, Gensim).
Proficiency in Python, SQL, and common data science libraries (pandas, NumPy, scikit-learn).
Solid understanding of statistics, machine learning algorithms, and model evaluation.
Experience deploying models into production environments.
Strong communication and stakeholder management skills.
Why Work With Us?
We believe in more than just jobs; we build careers. At Net2Source, we champion leadership at all levels, celebrate diverse perspectives, and empower you to make an impact. Think work-life balance, professional growth, and a collaborative culture where your ideas matter.
Our Commitment to Inclusion & Equity
Net2Source is an equal opportunity employer, dedicated to fostering a workplace where diverse talents and perspectives are valued. We make all employment decisions based on merit, ensuring a culture of respect, fairness, and opportunity for all, regardless of age, gender, ethnicity, disability, or other protected characteristics.
Awards & Recognition
America's Most Honored Businesses (Top 10%)
Fastest-Growing Staffing Firm by Staffing Industry Analysts
INC 5000 List for Eight Consecutive Years
Top 100 by Dallas Business Journal
Spirit of Alliance Award by Agile1
Maddhuker Singh
Sr Account & Delivery Manager
***********************
Staff Data Engineer
Data scientist job in Fremont, CA
🌎 San Francisco (Hybrid)
💼 Founding/Staff Data Engineer
💵 $200-300k base
Our client is an elite applied AI research and product lab building AI-native systems for finance, pushing frontier models into real production environments. Their work sits at the intersection of data, research, and high-stakes financial decision-making.
As the Founding Data Engineer, you will own the data platform that powers everything: models, experiments, and user-facing products relied on by demanding financial customers. You'll make foundational architectural decisions, work directly with researchers and product engineers, and help define how data is built, trusted, and scaled from day one.
What you'll do:
Design and build the core data platform, ingesting, transforming, and serving large-scale financial and alternative datasets.
Partner closely with researchers and ML engineers to ship production-grade data and feature pipelines that power cutting-edge models.
Establish data quality, observability, lineage, and reproducibility across both experimentation and production workloads.
Deploy and operate data services using Docker and Kubernetes in a modern cloud environment (AWS, GCP, or Azure).
Make foundational choices on tooling, architecture, and best practices that will define how data works across the company.
Continuously simplify and evolve systems, rewriting pipelines or infrastructure when it's the right long-term decision.
Ideal candidate:
Have owned or built high-performance data systems end-to-end, directly supporting production applications and ML models.
Are strongest in backend and data infrastructure, with enough frontend literacy to integrate cleanly with web products when needed.
Can design and evolve backend services and pipelines (Node.js or Python) to support new product features and research workflows.
Are an expert in at least one statically typed language, with a strong bias toward type safety, correctness, and maintainable systems.
Have deployed data workloads and services using Docker and Kubernetes on a major cloud provider.
Are comfortable making hard calls-simplifying, refactoring, or rebuilding legacy pipelines when quality and scalability demand it.
Use AI tools to accelerate your work, but rigorously review and validate AI-generated code, insisting on sound system design.
Thrive in a high-bar, high-ownership environment with other exceptional engineers.
Love deep technical problems in data infrastructure, distributed systems, and performance.
Nice to have:
Experience working with financial data (market, risk, portfolio, transactional, or alternative datasets).
Familiarity with ML infrastructure, such as feature stores, experiment tracking, or model serving systems.
Background in a high-growth startup or a foundational infrastructure role.
Compensation & setup:
Competitive salary and founder-level equity
Hybrid role based in San Francisco, with close collaboration and significant ownership
Small, elite team building core infrastructure with outsized impact
AI Data Engineer
Data scientist job in San Francisco, CA
Member of Technical Staff - AI Data Engineer
San Francisco (In-Office)
$150K to $225K + Equity
A high-growth, AI-native startup coming out of stealth is hiring AI Data Engineers to build the systems that power production-grade AI. The company has recently signed a Series A term sheet and is scaling rapidly. This role is central to unblocking current bottlenecks across data engineering, context modeling, and agent performance.
Responsibilities:
• Build distributed, reliable data pipelines using Airflow, Temporal, and n8n
• Model SQL, vector, and NoSQL databases (Postgres, Qdrant, etc.)
• Build API and function-based services in Python
• Develop custom automations (Playwright, Stagehand, Zapier)
• Work with AI researchers to define and expose context as services
• Identify gaps in data quality and drive changes to upstream processes
• Ship fast, iterate, and own outcomes end-to-end
Required Experience:
• Strong background in data engineering
• Hands-on experience working with LLMs or LLM-powered applications
• Data modeling skills across SQL and vector databases
• Experience building distributed systems
• Experience with Airflow, Temporal, n8n, or similar workflow engines
• Python experience (API/services)
• Startup mindset and bias toward rapid execution
Nice To Have:
• Experience with stream processing (Flink)
• dbt or Clickhouse experience
• CDC pipelines
• Experience with context construction, RAG, or agent workflows
• Analytical tooling (Posthog)
What You Can Expect:
• High-intensity, in-office environment
• Fast decision-making and rapid shipping cycles
• Real ownership over architecture and outcomes
• Opportunity to work on AI systems operating at meaningful scale
• Competitive compensation package
• Meals provided plus full medical, dental, and vision benefits
If this sounds like you, please apply now.
Data Engineer
Data scientist job in San Francisco, CA
Elevate Data Engineer
Hybrid, CA
Brooksource is searching for an Associate Data Engineer to join our healthcare partner and support their data analytics groups. This position is through Brooksource's Elevate Program and includes additional technical training in areas such as SQL, Python, dbt, and Azure.
Responsibilities
Assist in the design, development, and implementation of ELT/ETL data pipelines using Azure-based technologies
Support data warehouse environments for large-scale enterprise systems
Help implement and maintain data models following best practices
Participate in data integration efforts to support reporting and analytics needs
Perform data validation, troubleshooting, and incident resolution for data pipelines
Support documentation of data flows, transformations, and architecture
DevOps & Platform Support
Assist with DevOps activities related to data platforms, including deployments and environment support
Help build and maintain automation scripts and reusable frameworks for data operations
Support CI/CD pipelines for data engineering workflows
Assist with monitoring, alerting, and basic performance optimization
Collaborate with senior engineers to support infrastructure-as-code and cloud resource management
Collaboration & Delivery
Work closely with data engineers, solution leads, data modelers, analysts, and business partners
Help translate business requirements into technical data solutions
Participate in code reviews, sprint planning, and team ceremonies
Follow established architecture, security, and data governance standards
Required Qualifications
Bachelor's degree in Computer Science, Engineering, Information Systems, or related field (or equivalent experience)
Foundational knowledge of data engineering concepts, including ETL/ELT and data warehousing
Experience or coursework with SQL and relational databases
Familiarity with Microsoft Azure or another cloud platform
Basic scripting experience (Python, SQL, PowerShell, or Bash)
Understanding of version control (Git)
Preferred / Nice-to-Have Skills
Exposure to Azure services such as Azure Data Factory, Synapse Analytics, Azure SQL, or Data Lake
Basic understanding of CI/CD pipelines and DevOps concepts
Familiarity with data modeling concepts (star schema, normalization)
Interest in automation, cloud infrastructure, and reliability engineering
Internship or project experience in data engineering or DevOps environments
Senior Data Engineer
Data scientist job in San Francisco, CA
If you're hands-on with modern data platforms, cloud tech, and big data tools, and you like building solutions that are secure, repeatable, and fast, this role is for you.
As a Senior Data Engineer, you will design, build, and maintain scalable data pipelines that transform raw information into actionable insights. The ideal candidate will have strong experience across modern data platforms, cloud environments, and big data technologies, with a focus on building secure, repeatable, and high-performing solutions.
Responsibilities:
Design, develop, and maintain secure, scalable data pipelines to ingest, transform, and deliver curated data into the Common Data Platform (CDP).
Participate in Agile rituals and contribute to delivery within the Scaled Agile Framework (SAFe).
Ensure quality and reliability of data products through automation, monitoring, and proactive issue resolution.
Deploy alerting and auto-remediation for pipelines and data stores to maximize system availability.
Apply a security-first and automation-driven approach to all data engineering practices.
Collaborate with cross-functional teams (data scientists, analysts, product managers, and business stakeholders) to align infrastructure with evolving data needs.
Stay current on industry trends and emerging tools, recommending improvements to strengthen efficiency and scalability.
Qualifications:
Bachelor's degree in Computer Science, Information Systems, or related field (or equivalent experience).
At least 3 years of experience with Python and PySpark, including Jupyter notebooks and unit testing.
At least 2 years of experience with Databricks, Collibra, and Starburst.
Proven work with relational and NoSQL databases, including STAR and dimensional modeling approaches.
Hands-on experience with modern data stacks: object stores (S3), Spark, Airflow, lakehouse architectures, and cloud warehouses (Snowflake, Redshift).
Strong background in ETL and big data engineering (on-prem and cloud).
Experience working within enterprise cloud platforms (CFS2, Cloud Foundational Services 2/EDS) for governance and compliance.
Experience building end-to-end pipelines for structured, semi-structured, and unstructured data using Spark.
Data Engineer, Knowledge Graphs
Data scientist job in San Francisco, CA
We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought.
Mithrl is building the world's first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports.
No coding. No waiting. No bioinformatics bottlenecks.
We are one of the fastest-growing tech bio companies in the Bay Area, with 12x year-over-year revenue growth. Our platform is used across three continents by leading biotechs and big pharmas. We power breakthroughs from early target discovery to mechanism-of-action. And we are just getting started.
ABOUT THE ROLE
We are hiring a Data Engineer, Knowledge Graphs to build the infrastructure that powers Mithrl's biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs to take curated knowledge sources and transform them into scalable, reliable, production-ready systems that serve the entire platform.
Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph-structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently. You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases.
This role is the bridge between biological knowledge ingestion and the high performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have deep impact on the intelligence layer of Mithrl.
WHAT YOU WILL DO
Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
Design, implement, and evolve schemas and storage models for graph-structured biological data
Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
Build data models that support multi-tenant access, versioning, and reproducibility across releases
Implement scalable storage and indexing strategies for high-volume graph data
Maintain data quality, validate data integrity, and build monitoring around ingestion and usage
Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications
Support data warehousing, documentation, and API reliability
Ensure performance, reliability, and uptime for knowledge graph services
WHAT YOU BRING
Required Qualifications
Strong experience as a data engineer or backend engineer working with data intensive systems
Experience building ETL or ELT pipelines for large structured or semi-structured datasets
Strong understanding of database design, schema modeling, and data architecture
Experience with graph data models or willingness to learn graph storage concepts
Proficiency in Python or similar languages for data engineering
Experience designing and maintaining APIs for data access
Understanding of versioning, provenance, validation, and reproducibility in data systems
Experience with cloud infrastructure and modern data stack tools
Strong communication skills and ability to work closely with scientific and engineering teams
Nice to Have
Experience with graph databases or graph query languages
Experience with biological or chemical data sources
Familiarity with ontologies, controlled vocabularies, and metadata standards
Experience with data warehousing and analytical storage formats
Previous work in a tech bio company or scientific platform environment
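The graph data models named in the qualifications can be pictured with a tiny sketch: typed, directed edges indexed by source node, plus a breadth-first path query of the kind a knowledge-graph API might expose. The entities, relation names, and functions below are hypothetical, not Mithrl's actual schema.

```python
from collections import deque

# Hypothetical slice of a biological knowledge graph: typed, directed edges.
EDGES = [
    ("TP53", "regulates", "MDM2"),
    ("MDM2", "inhibits", "TP53"),
    ("TP53", "associated_with", "Li-Fraumeni syndrome"),
    ("MDM2", "targeted_by", "Nutlin-3"),
]

def build_adjacency(edges):
    """Index edges by source node for fast neighborhood lookups."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
    return adj

def find_path(adj, start, goal):
    """Breadth-first search returning one relation-annotated path, or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"-{rel}->", nxt]))
    return None

adj = build_adjacency(EDGES)
print(find_path(adj, "TP53", "Nutlin-3"))
```

At production scale the same shape of query would run against a graph database with proper indexing and versioning, but the data model, nodes, typed relationships, and path retrieval, is the core of the role.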
WHAT YOU WILL LOVE AT MITHRL
You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable
Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
Speed: We ship fast (2x/week) and improve continuously based on real user feedback
Location: Beautiful SF office with a high-energy, in-person culture
Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans
Senior ML Data Engineer
Data scientist job in San Francisco, CA
We're the data team behind Midjourney's image generation models. We handle the dataset side: processing, filtering, scoring, captioning, and all the distributed compute that makes high-quality training data possible.
What you'd be working on:
Large-scale dataset processing and filtering pipelines
Training classifiers for content moderation and quality assessment
Models for data quality and aesthetic evaluation
Data visualization tools for experimenting on dataset samples
Testing/simulating distributed inference pipelines
Monitoring dashboards for data quality and pipeline health
Performance optimization and infrastructure scaling
Occasionally jumping into inference optimization and other cross-team projects
Our current stack: PySpark, Slurm, and distributed batch processing across a hybrid cloud setup. We're pragmatic about tools; if there's something better, we'll switch.
We're looking for someone strong in either:
Data engineering/ML pipelines at scale, or
Cloud/infrastructure with distributed systems experience
Exact tech matches aren't required; comfort with adjacent technologies and willingness to learn matter more. We work with our own hardware plus GCP and other providers, so adaptability across different environments is valuable.
Location: SF office a few times per week (we may make exceptions on location for truly exceptional candidates)
The role offers variety: our team members often get pulled into different projects across the company, from dataset work to inference optimization. If you're interested in the intersection of large-scale data processing and cutting-edge generative AI, we'd love to hear from you.
Data Engineer / Analytics Specialist
Data scientist job in San Francisco, CA
Citizenship Requirement: U.S. Citizens Only
ITTConnect is seeking a Data Engineer / Analytics Specialist to work for one of our clients, a major technology consulting firm headquartered in Europe. They are experts in tailored technology consulting and services for banks, investment firms, and other financial-vertical clients.
Job location: San Francisco Bay area or NY City.
Work Model: Ability to come into the office as requested
Seniority: 10+ years of total experience
About the role:
The Data Engineer / Analytics Specialist will support analytics, product insights, and AI initiatives. You will build robust data pipelines, integrate data sources, and enhance the organization's analytical foundations.
Responsibilities:
Build and operate Snowflake-based analytics environments.
Develop ETL/ELT pipelines (DBT, Airflow, etc.).
Integrate APIs, external data sources, and streaming inputs.
Perform query optimization, basic data modeling, and analytics support.
Enable downstream GenAI and analytics use cases.
Requirements:
10+ years of overall technology experience
3+ years hands-on AWS experience required
Strong SQL and Snowflake experience.
Hands-on pipeline engineering with DBT, Airflow, or similar.
Experience with API integrations and modern data architectures.
Data Engineer
Data scientist job in San Francisco, CA
You'll work closely with engineering, analytics, and product teams to ensure data is accurate, accessible, and efficiently processed across the organization.
Key Responsibilities:
Design, develop, and maintain scalable data pipelines and architectures.
Collect, process, and transform data from multiple sources into structured, usable formats.
Ensure data quality, reliability, and security across all systems.
Work with data analysts and data scientists to optimize data models for analytics and machine learning.
Implement ETL (Extract, Transform, Load) processes and automate workflows.
Monitor and troubleshoot data infrastructure, ensuring minimal downtime and high performance.
Collaborate with cross-functional teams to define data requirements and integrate new data sources.
Maintain comprehensive documentation for data systems and processes.
Requirements:
Proven experience as a Data Engineer, ETL Developer, or similar role.
Strong programming skills in Python, SQL, or Scala.
Experience with data pipeline tools (Airflow, dbt, Luigi, etc.).
Familiarity with big data technologies (Spark, Hadoop, Kafka, etc.).
Hands-on experience with cloud data platforms (AWS, GCP, Azure, Snowflake, or Databricks).
Understanding of data modeling, warehousing, and schema design.
Solid knowledge of database systems (PostgreSQL, MySQL, NoSQL).
Strong analytical and problem-solving skills.
Data Engineer (SQL / SQL Server Focus)
Data scientist job in San Francisco, CA
Kindly note: we are unable to provide sponsorship for this role.
A leading professional services organization is seeking an experienced Data Engineer to join its team. This role supports enterprise-wide systems, analytics, and reporting initiatives, with a strong emphasis on SQL Server-based data platforms.
Key Responsibilities
Design, develop, and optimize SQL Server-centric ETL/ELT pipelines to ensure reliable, accurate, and timely data movement across enterprise systems.
Develop and maintain SQL Server data models, schemas, and tables to support financial analytics and reporting.
Write, optimize, and maintain complex T-SQL queries, stored procedures, functions, and views with a strong focus on performance and scalability.
Build and support SQL Server Reporting Services (SSRS) solutions, translating business requirements into clear, actionable reports.
Partner with finance and business stakeholders to define KPIs and ensure consistent, trusted reporting outputs.
Monitor, troubleshoot, and tune SQL Server workloads, including query performance, indexing strategies, and execution plans.
Ensure adherence to data governance, security, and access control standards within SQL Server environments.
Support documentation, version control, and change management for database and reporting solutions.
Collaborate closely with business analysts, data engineers, and IT teams to deliver end-to-end data solutions.
Mentor junior team members and contribute to database development standards and best practices.
Act as a key contributor to enterprise data architecture and reporting strategy, particularly around SQL Server platforms.
Required Education & Experience
Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field.
8+ years of hands-on experience working with SQL Server in enterprise data warehouse or financial reporting environments.
Advanced expertise in T-SQL, including:
Query optimization
Index design and maintenance
Stored procedures and performance tuning
Strong experience with SQL Server Integration Services (SSIS) and SSRS.
Solid understanding of data warehousing concepts, including star and snowflake schemas, and OLAP vs. OLTP design.
Experience supporting large, business-critical databases with high reliability and performance requirements.
Familiarity with Azure-based SQL Server deployments (Azure SQL, Managed Instance, or SQL Server on Azure VMs) is a plus.
Strong analytical, problem-solving, and communication skills, with the ability to work directly with non-technical stakeholders.
Imaging Data Engineer/Architect
Data scientist job in San Francisco, CA
About us:
Intuitive is an innovation-led engineering company delivering business outcomes for hundreds of enterprises globally. With a reputation as a Tiger Team and a trusted partner of enterprise technology leaders, we help solve the most complex Digital Transformation challenges across the following Intuitive Superpowers:
Modernization & Migration
Application & Database Modernization
Platform Engineering (IaC/EaC, DevSecOps & SRE)
Cloud Native Engineering, Migration to Cloud, VMware Exit
FinOps
Data & AI/ML
Data (Cloud Native / DataBricks / Snowflake)
Machine Learning, AI/GenAI
Cybersecurity
Infrastructure Security
Application Security
Data Security
AI/Model Security
SDx & Digital Workspace (M365, G-suite)
SDDC, SD-WAN, SDN, NetSec, Wireless/Mobility
Email, Collaboration, Directory Services, Shared Files Services
Intuitive Services:
Professional and Advisory Services
Elastic Engineering Services
Managed Services
Talent Acquisition & Platform Resell Services
About the job:
Title: Imaging Data Engineer/Architect
Start Date: Immediate
# of Positions: 1
Position Type: Contract/ Full-Time
Location: San Francisco, CA
Notes:
An Imaging Data Engineer/Architect who understands radiology and digital pathology, along with the related clinical data and metadata.
Hands-on experience with the above technologies, plus strong knowledge of biomedical imaging and data pipelines overall.
About the Role
We are seeking a highly skilled Imaging Data Engineer/Architect to join our San Francisco team as a Subject Matter Expert (SME) in radiology and digital pathology. This role will design and manage imaging data pipelines, ensuring seamless integration of clinical data and metadata to support advanced diagnostic and research applications. The ideal candidate will have deep expertise in medical imaging standards, cloud-based data architectures, and healthcare interoperability, contributing to innovative solutions that enhance patient outcomes.
Responsibilities
Design and implement scalable data architectures for radiology and digital pathology imaging data, including DICOM, HL7, and FHIR standards.
Develop and optimize data pipelines to process and store large-scale imaging datasets (e.g., MRI, CT, histopathology slides) and associated metadata.
Collaborate with clinical teams to understand radiology and pathology workflows, ensuring data solutions align with clinical needs.
Ensure data integrity, security, and compliance with healthcare regulations (e.g., HIPAA, GDPR).
Integrate imaging data with AI/ML models for diagnostic and predictive analytics, working closely with data scientists.
Build and maintain metadata schemas to support data discoverability and interoperability across systems.
Provide technical expertise to cross-functional teams, including product managers and software engineers, to drive imaging data strategy.
Conduct performance tuning and optimization of imaging data storage and retrieval systems in cloud environments (e.g., AWS, Google Cloud, Azure).
Document data architectures and processes, ensuring knowledge transfer to internal teams and external partners.
Stay updated on emerging imaging technologies and standards, proposing innovative solutions to enhance data workflows.
Qualifications
Education: Bachelor's degree in Computer Science, Biomedical Engineering, or a related field (Master's preferred).
Experience:
5+ years in data engineering or architecture, with at least 3 years focused on medical imaging (radiology and/or digital pathology).
Proven experience with DICOM, HL7, FHIR, and imaging metadata standards (e.g., SNOMED, LOINC).
Hands-on experience with cloud platforms (AWS, Google Cloud, or Azure) for imaging data storage and processing.
Technical Skills:
Proficiency in programming languages (e.g., Python, Java, SQL) for data pipeline development.
Expertise in ETL processes, data warehousing, and database management (e.g., Snowflake, BigQuery, PostgreSQL).
Familiarity with AI/ML integration for imaging data analytics.
Knowledge of containerization (e.g., Docker, Kubernetes) for deploying data solutions.
Domain Knowledge:
Deep understanding of radiology and digital pathology workflows, including PACS and LIS systems.
Familiarity with clinical data integration and healthcare interoperability standards.
Soft Skills:
Strong analytical and problem-solving skills to address complex data challenges.
Excellent communication skills to collaborate with clinical and technical stakeholders.
Ability to work independently in a fast-paced environment, with a proactive approach to innovation.
Certifications (preferred):
AWS Certified Solutions Architect, Google Cloud Professional Data Engineer, or equivalent.
Certifications in medical imaging (e.g., CIIP - Certified Imaging Informatics Professional).
Data Engineer
Data scientist job in Pleasanton, CA
Job Title: Data Engineer
The hiring manager prefers the candidate to be on site in Pleasanton.
Proficiency in Spark, Python, and SQL is essential for this role. 10+ years of experience with relational databases such as Oracle, NoSQL databases including MongoDB and Cassandra, and big data technologies, particularly Databricks, is required. Strong knowledge of data modeling techniques is necessary for designing efficient and scalable data structures. Familiarity with APIs and web services, including REST and SOAP, is important for integrating various data sources and ensuring seamless data flow. This role involves leveraging these technical skills to build and maintain robust data pipelines and support advanced data analytics.
SKILLS:
- Spark/Python/SQL
- Relational Database (Oracle) / NoSQL Database (MongoDB/ Cassandra) / Databricks
- Big Data technologies - Databricks preferred
- Data modeling techniques
- APIs and web services (REST/ SOAP)
If interested, please share the details below along with an updated resume:
Full Name:
Phone:
E-mail:
Rate:
Location:
Visa Status:
Availability:
SSN (Last 4 digit):
Date of Birth:
LinkedIn Profile:
Availability for the interview:
Availability for the project:
Data Scientist 4
Data scientist job in Fremont, CA
Analyze large, complex datasets from diverse sources to uncover insights and identify opportunities for innovation. Design, build, and deploy robust machine learning models with meaningful uncertainty quantification. Perform rigorous data engineering and model evaluation, including feature engineering, hyperparameter tuning, and model selection.
Collaborate with engineering teams to integrate models into production codebases, promoting best practices in code quality and maintainability.
Communicate findings and technical results clearly to both technical and non-technical stakeholders.
Master's degree with 6+ years of experience, or Ph.D. with 3+ years, in Computer Science, Engineering, Physics, Applied Mathematics, Statistics, or a related quantitative field.
Machine Learning Expertise: Strong theoretical foundation and hands-on experience in ML algorithms, deep learning, AI, statistics, or optimization.
Programming Skills: Proficient in Python, with motivation to write efficient, maintainable, testable, and well-documented code.
ML Frameworks: Experience with modern ML frameworks such as PyTorch, JAX, or TensorFlow.
Problem Solving: Demonstrated analytical and critical thinking skills, with a track record of delivering impactful R&D solutions.
Team Collaboration: Proven success working in cross-functional teams with strong execution and communication skills.
Domain expertise in semiconductor engineering, Bayesian statistics, process engineering, multi-physics modeling, or numerical simulation.
Familiarity with Linux/Unix operating systems.
Experience with MLOps tools and principles (e.g., Docker, CI/CD pipelines).
Data Scientist
Data scientist job in Berkeley, CA
LifeLong Medical Care has an exciting opportunity for a Data Scientist to provide programming support to build analytic applications to support business decision making in the organization. This is a part time, 30 hour/week, benefit eligible position.
LifeLong Medical Care is a multi-site, Federally Qualified Health Center (FQHC) with a rich history of providing innovative healthcare and social services to a wonderfully diverse patient community. Our patient-centered health home is a dynamic place to work, practice, and grow. We have over 15 primary care health centers and deliver integrated services including psychosocial, referrals, chronic disease management, dental, health education, home visits, and much, much more.
Benefits
Compensation: $71k - $75k/year. We offer excellent benefits including: medical, dental, vision (including dependent and domestic partner coverage), generous leave benefits including ten paid holidays, Flexible Spending Accounts, 403(b) retirement savings plan.
Responsibilities
* Under the supervision of the Manager of Analytics, the data scientist is a senior, key member of the data analytics team, developing data insights through reporting and assisting all data reporting tool users at LifeLong Medical Care, including documenting report requirements and report implementations.
* The senior analyst is the core content expert for designated subjects as assigned by the Manager of Analytics or a designee.
* Maintains integrity of the data warehouse in their content areas or as assigned
* Develops and maintains internal reporting services platform using SSRS and Tableau. Supports Data Analysts and Junior Analysts in report development.
* Provides analytic support and data insights to one or multiple departments and develops a variety of complex ad hoc, production and/or trend reports to support business decisions and operational processes for internal and external clients.
* Collaboratively develops data strategy for core content area
* Arranges project requirements in programming sequence by analyzing requirements and preparing workflow charts and diagrams, drawing on knowledge of computer capabilities, subject matter, programming languages, and logic.
* Communicates with clients and key stakeholders to develop specifications and create analytical applications.
* Develops and maintains applications and databases by evaluating client needs, analyzing requirements, and developing software systems.
* Performs additional duties in support of the team and immediate reporting need of other departments as assigned by supervisor.
* Protects operations by keeping information confidential and complies with HIPAA requirements.
Qualifications
* Commitment to the provision of primary care services for the underserved with demonstrated ability and sensitivity in working with a variety of people from low-income populations, with diverse educational, lifestyle, ethnic and cultural origins.
* Be creative and mature with a "can do," proactive attitude.
* Ability to effectively support, motivate and supervise staff, encourage and nurture development and growth, to build a strong and productive team.
* Strong organizational, administrative, multi-tasking, prioritization and problem-solving skills.
* Ability to work effectively under pressure in a positive, friendly manner and to be flexible and adaptive to change.
* Ability to take initiative, work independently and make sound judgments within established guidelines; understand and apply oral and written instructions; establish and maintain effective working relations with staff, clinical providers, managers and external agencies or organizations.
* Excellent interpersonal, verbal, and written skills and ability to effectively work with people from diverse backgrounds and be culturally sensitive.
* Work in a team-oriented environment with a number of professionals with different work styles and support needs.
* Conduct oneself in internal and external settings in a way that reflects positively on LifeLong Medical Care as an organization of professional, confident and sensitive staff.
* Ability to continuously scan the environment, identifying opportunities for improvement and intersections with other departments of LifeLong Medical Care and partner organizations.
Job Requirements
* Bachelor's degree (Master's preferred) in Computer Science or a related field, or an equivalent combination of education and/or experience.
* Minimum 10 years of experience in programming and data analysis involving duties listed above.
* Experience in a healthcare-related field and/or data reporting work and data visualization development.
* Excellent skills in SQL scripting and knowledge of database development.
* Basic understanding of SSIS
* Proficiency in Microsoft Offices, including Excel, PowerPoint, Word.
Job Preferences
* Community Health Center experience.
* Microsoft Certified Solution Associate (MCSA) in SQL database development.
Principal Data Scientist: Product to Market (P2M) Optimization
Data scientist job in San Francisco, CA
About Gap Inc. Our brands bridge the gaps we see in the world. Old Navy democratizes style to ensure everyone has access to quality fashion at every price point. Athleta unleashes the potential of every woman, regardless of body size, age or ethnicity. Banana Republic believes in sustainable luxury for all. And Gap inspires the world to bring individuality to modern, responsibly made essentials.
This simple idea-that we all deserve to belong, and on our own terms-is core to who we are as a company and how we make decisions. Our team is made up of thousands of people across the globe who take risks, think big, and do good for our customers, communities, and the planet. Ready to learn fast, create with audacity and lead boldly? Join our team.
About the Role
Gap Inc. is seeking a Principal Data Scientist with deep expertise in operations research and machine learning to lead the design and deployment of advanced analytics solutions across the Product-to-Market (P2M) space. This role focuses on driving enterprise-scale impact through optimization and data science initiatives spanning pricing, inventory, and assortment optimization.
The Principal Data Scientist serves as a senior technical and strategic thought partner, defining solution architectures, influencing product and business decisions, and ensuring that analytical solutions are both technically rigorous and operationally viable. The ideal candidate can lead end-to-end solutioning independently, manage ambiguity and complex stakeholder dynamics, and communicate technical and business risk effectively across teams and leadership levels.
What You'll Do
* Lead the framing, design, and delivery of advanced optimization and machine learning solutions for high-impact retail supply chain challenges.
* Partner with product, engineering, and business leaders to define analytics roadmaps, influence strategic priorities, and align technical investments with business goals.
* Provide technical leadership to other data scientists through mentorship, design reviews, and shared best practices in solution design and production deployment.
* Evaluate and communicate solution risks proactively, grounding recommendations in realistic assessments of data, system readiness, and operational feasibility.
* Evaluate, quantify, and communicate the business impact of deployed solutions using statistical and causal inference methods, ensuring benefit realization is measured rigorously and credibly.
* Serve as a trusted advisor by effectively managing stakeholder expectations, influencing decision-making, and translating analytical outcomes into actionable business insights.
* Drive cross-functional collaboration by working closely with engineering, product management, and business partners to ensure model deployment and adoption success.
* Quantify business benefits from deployed solutions using rigorous statistical and causal inference methods, ensuring that model outcomes translate into measurable value
* Design and implement robust, scalable solutions using Python, SQL, and PySpark on enterprise data platforms such as Databricks and GCP.
* Contribute to the development of enterprise standards for reproducible research, model governance, and analytics quality.
Who You Are
* Master's or Ph.D. in Operations Research, Operations Management, Industrial Engineering, Applied Mathematics, or a closely related quantitative discipline.
* 10+ years of experience developing, deploying, and scaling optimization and data science solutions in retail, supply chain, or similar complex domains.
* Proven track record of delivering production-grade analytical solutions that have influenced business strategy and delivered measurable outcomes.
* Strong expertise in operations research methods, including linear, nonlinear, and mixed-integer programming, stochastic modeling, and simulation.
* Deep technical proficiency in Python, SQL, and PySpark, with experience in optimization and ML libraries such as Pyomo, Gurobi, OR-Tools, scikit-learn, and MLlib.
* Hands-on experience with enterprise platforms such as Databricks and cloud environments.
* Demonstrated ability to assess, communicate, and mitigate risk across analytical, technical, and business dimensions.
* Excellent communication and storytelling skills, with a proven ability to convey complex analytical concepts to technical and non-technical audiences.
* Strong collaboration and influence skills, with experience leading cross-functional teams in matrixed organizations.
* Experience managing code quality, CI/CD pipelines, and GitHub-based workflows.
Preferred Qualifications
* Experience shaping and executing multi-year analytics strategies in retail or supply chain domains.
* Proven ability to balance long-term innovation with short-term deliverables.
* Background in agile product development and stakeholder alignment for enterprise-scale initiatives.
Benefits at Gap Inc.
* Merchandise discount for our brands: 50% off regular-priced merchandise at Old Navy, Gap, Banana Republic and Athleta, and 30% off at Outlet for all employees.
* One of the most competitive Paid Time Off plans in the industry.*
* Employees can take up to five "on the clock" hours each month to volunteer at a charity of their choice.*
* Extensive 401(k) plan with company matching for contributions up to four percent of an employee's base pay.*
* Employee stock purchase plan.*
* Medical, dental, vision and life insurance.*
* See more of the benefits we offer.
* For eligible employees
Gap Inc. is an equal-opportunity employer and is committed to providing a workplace free from harassment and discrimination. We are committed to recruiting, hiring, training and promoting qualified people of all backgrounds, and make all employment decisions without regard to any protected status. We have received numerous awards for our long-held commitment to equality and will continue to foster a diverse and inclusive environment of belonging. In 2022, we were recognized by Forbes as one of the World's Best Employers and one of the Best Employers for Diversity.
Salary Range: $201,700 - $267,300 USD
Employee pay will vary based on factors such as qualifications, experience, skill level, competencies and work location. We will meet minimum wage or minimum of the pay range (whichever is higher) based on city, county and state requirements.