Staff Data Scientist
Data scientist job in Santa Rosa, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge- apply today and we'll be in touch.
Data Scientist
Data scientist job in Long Beach, CA
STAND 8 provides end to end IT solutions to enterprise partners across the United States and with offices in Los Angeles, New York, New Jersey, Atlanta, and more including internationally in Mexico and India We are seeking a highly analytical and technically skilled Data Scientist to transform complex, multi-source data into unified, actionable insights used for executive reporting and decision-making.
This role requires expertise in business intelligence design, data modeling, metadata management, data integrity validation, and the development of dashboards, reports, and analytics used across operational and strategic environments.
The ideal candidate thrives in a fast-paced environment, demonstrates strong investigative skills, and can collaborate effectively with technical teams, business stakeholders, and leadership.
Essential Duties & Responsibilities
As a Data Scientist, participate across the full solution lifecycle: business case, planning, design, development, testing, migration, and production support.
Analyze large and complex datasets with accuracy and attention to detail.
Collaborate with users to develop effective metadata and data relationships.
Identify reporting and dashboard requirements across business units.
Determine strategic placement of business logic within ETL or metadata models.
Build enterprise data warehouse metadata/semantic models.
Design and develop unified dashboards, reports, and data extractions from multiple data sources.
Develop and execute testing methodologies for reports and metadata models.
Document BI architecture, data lineage, and project report requirements.
Provide technical specifications and data definitions to support the enterprise data dictionary.
Apply analytical skills and Data Science techniques to understand business processes, financial calculations, data flows, and application interactions.
Identify and implement improvements, workarounds, or alternative solutions related to ETL processes, ensuring integrity and timeliness.
Create UI components or portal elements (e.g., SharePoint) for dynamic or interactive stakeholder reporting.
As a Data Scientist, download and process SQL database information to build Power BI or Tableau reports (including cybersecurity awareness campaigns).
Utilize SQL, Python, R, or similar languages for data analysis and modeling.
Support process optimization through advanced modeling, leveraging experience as a Data Scientist where needed.
Required Knowledge & Attributes
Highly self-motivated with strong organizational skills and ability to manage multiple verbal and written assignments.
Experience collaborating across organizational boundaries for data sourcing and usage.
Analytical understanding of business processes, forecasting, capacity planning, and data governance.
Proficient with BI tools (Power BI, Tableau, PBIRS, SSRS, SSAS).
Strong Microsoft Office skills (Word, Excel, Visio, PowerPoint).
High attention to detail and accuracy.
Ability to work independently, demonstrate ownership, and ensure high-quality outcomes.
Strong communication, interpersonal, and stakeholder engagement skills.
Deep understanding that data integrity and consistency are essential for adoption and trust.
Ability to shift priorities and adapt within fast-paced environments.
Required Education & Experience
Bachelor's degree in Computer Science, Mathematics, or Statistics (or equivalent experience).
3+ years of BI development experience.
3+ years with Power BI and supporting Microsoft stack tools (SharePoint 2019, PBIRS/SSRS, Excel 2019/2021).
3+ years of experience with SDLC/project lifecycle processes
3+ years of experience with data warehousing methodologies (ETL, Data Modeling).
3+ years of VBA experience in Excel and Access.
Strong ability to write SQL queries and work with SQL Server 2017-2022.
Experience with BI tools including PBIRS, SSRS, SSAS, Tableau.
Strong analytical skills in business processes, financial modeling, forecasting, and data flow understanding.
Critical thinking and problem-solving capabilities.
Experience producing high-quality technical documentation and presentations.
Excellent communication and presentation skills, with the ability to explain insights to leadership and business teams.
Benefits
Medical coverage and Health Savings Account (HSA) through Anthem
Dental/Vision/Various Ancillary coverages through Unum
401(k) retirement savings plan
Paid-time-off options
Company-paid Employee Assistance Program (EAP)
Discount programs through ADP WorkforceNow
Additional Details
The base range for this contract position is $73 - $83 / per hour, depending on experience. Our pay ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hires of this position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Qualified applicants with arrest or conviction records will be considered
About Us
STAND 8 provides end-to-end IT solutions to enterprise partners across the United States and globally with offices in Los Angeles, Atlanta, New York, Mexico, Japan, India, and more. STAND 8 focuses on the "bleeding edge" of technology and leverages automation, process, marketing, and over fifteen years of success and growth to provide a world-class experience for our customers, partners, and employees.
Our mission is to impact the world positively by creating success through PEOPLE, PROCESS, and TECHNOLOGY.
Check out more at ************** and reach out today to explore opportunities to grow together!
By applying to this position, your data will be processed in accordance with the STAND 8 Privacy Policy.
Data Scientist
Data scientist job in San Francisco, CA
We're working with a Series A health tech start-up pioneering a revolutionary approach to healthcare AI, developing neurosymbolic systems that combine statistical learning with structured medical knowledge. Their technology is being adopted by leading health systems and insurers to enhance patient outcomes through advanced predictive analytics.
We're seeking Machine Learning Engineers who excel at the intersection of data science, modeling, and software engineering. You'll design and implement models that extract insights from longitudinal healthcare data, balancing analytical rigor, interpretability, and scalability.
This role offers a unique opportunity to tackle foundational modeling challenges in healthcare, where your contributions will directly influence clinical, actuarial, and policy decisions.
Key Responsibilities
Develop predictive models to forecast disease progression, healthcare utilization, and costs using temporal clinical data (claims, EHR, laboratory results, pharmacy records)
Design interpretable and explainable ML solutions that earn the trust of clinicians, actuaries, and healthcare decision-makers
Research and prototype innovative approaches leveraging both classical and modern machine learning techniques
Build robust, scalable ML pipelines for training, validation, and deployment in distributed computing environments
Collaborate cross-functionally with data engineers, clinicians, and product teams to ensure models address real-world healthcare needs
Communicate findings and methodologies effectively through visualizations, documentation, and technical presentations
Required Qualifications
Strong foundation in statistical modeling, machine learning, or data science, with preference for experience in temporal or longitudinal data analysis
Proficiency in Python and ML frameworks (PyTorch, JAX, NumPyro, PyMC, etc.)
Proven track record of transitioning models from research prototypes to production systems
Experience with probabilistic methods, survival analysis, or Bayesian inference (highly valued)
Bonus Qualifications
Experience working with clinical data and healthcare terminologies (ICD, CPT, SNOMED CT, LOINC)
Background in actuarial modeling, claims forecasting, or risk adjustment methodologies
Lead Data Scientist - Computer Vision
Data scientist job in Santa Clara, CA
Lead Data Scientist - Computer Vision/Image Processing
About the Role
We are seeking a Lead Data Scientist to drive the strategy and execution of data science initiatives, with a particular focus on computer vision systems & image processing techniques. The ideal candidate has deep expertise in image processing techniques including Filtering, Binary Morphology, Perspective/Affine Transformation, Edge Detection.
Responsibilities
Solid knowledge of computer vision programs and image processing techniques: Filtering, Binary Morphology, Perspective/Affine Transformation, Edge Detection
Strong understanding of machine learning: Regression, Supervised and Unsupervised Learning
Proficiency in Python and libraries such as OpenCV, NumPy, scikit-learn, TensorFlow/PyTorch.
Familiarity with version control (Git) and collaborative development practices
Principal Data Scientist
Data scientist job in Alhambra, CA
The Principal Data Scientist works to establish a comprehensive Data Science Program to advance data-driven decision-making, streamline operations, and fully leverage modern platforms including Databricks, or similar, to meet increasing demand for predictive analytics and AI solutions.
The Principal Data Scientist will guide program development, provide training and mentorship to junior members of the team, accelerate adoption of advanced analytics, and build internal capacity through structured mentorship.
The Principal Data Scientist will possess exceptional communication abilities, both verbal and written, with a strong customer service mindset and the ability to translate complex concepts into clear, actionable insights; strong analytical and business acumen, including foundational experience with regression, association analysis, outlier detection, and core data analysis principles; working knowledge of database design and organization, with the ability to partner effectively with Data Management and Data Engineering teams; outstanding time management and organizational skills, with demonstrated success managing multiple priorities and deliverables in parallel; a highly collaborative work style, coupled with the ability to operate independently, maintain focus, and drive projects forward with minimal oversight; a meticulous approach to quality, ensuring accuracy, reliability, and consistency in all deliverables; and proven mentorship capabilities, including the ability to guide, coach, and upskill junior data scientists and analysts.
5+ years of professional experience leading data science initiatives, including developing machine learning models, statistical analyses, and end-to-end data science workflows in production environments.
3+ years of experience working with Databricks and similar cloud-based analytics platforms, including notebook development, feature engineering, ML model training, and workflow orchestration.
3+ years of experience applying advanced analytics and predictive modeling (e.g., regression, classification, clustering, forecasting, natural language processing).
2+ years of experience implementing MLOps practices, such as model versioning, CI/CD for ML, MLflow, automated pipelines, and model performance monitoring.
2+ years of experience collaborating with data engineering teams to design data pipelines, optimize data transformations, and implement Lakehouse or data warehouse architectures (e.g., Databricks, Snowflake, SQL-based platforms).
2+ years of experience mentoring or supervising junior data scientists or analysts, including code reviews, training, and structured skill development.
2+ years of experience with Python and SQL programming, using data sources such as SQL Server, Oracle, PostgreSQL, or similar relational databases.
1+ year of experience operationalizing analytics within enterprise governance frameworks, partnering with Data Management, Security, and IT to ensure compliance, reproducibility, and best practices.
Education:
This classification requires possession of a Master's degree or higher in Data Science, Statistics, Computer Science, or a closely related field. Additional qualifying professional experience may be substituted for the required education on a year-for-year basis.
At least one of the following industry-recognized certifications in data science or cloud analytics, such as: • Microsoft Azure Data Scientist Associate (DP-100) • Databricks Certified Data Scientist or Machine Learning Professional • AWS Machine Learning Specialty • Google Professional Data Engineer • or equivalent advanced analytics certifications. The certification is required and may not be substituted with additional experience.
Principal Data Scientist
Data scientist job in Alhambra, CA
Job Title - Principal Data Scientist- 809
Duration - 12+ months
**Only on W2 ** & **California Resident Candidates Only**
JOB DETAILS:
Skills Required:
The Principal Data Scientist works to establish a comprehensive Data Science Program to advance data-driven decision-making, streamline operations, and fully leverage modern platforms including Databricks, or similar, to meet increasing demand for predictive analytics and AI solutions. The Principal Data Scientist will guide program development, provide training and mentorship to junior members of the team, accelerate adoption of advanced analytics, and build internal capacity through structured mentorship. The Principal Data Scientist will possess exceptional communication abilities, both verbal and written, with a strong customer service mindset and the ability to translate complex concepts into clear, actionable insights; strong analytical and business acumen, including foundational experience with regression, association analysis, outlier detection, and core data analysis principles; working knowledge of database design and organization, with the ability to partner effectively with Data Management and Data Engineering teams; outstanding time management and organizational skills, with demonstrated success managing multiple priorities and deliverables in parallel; a highly collaborative work style, coupled with the ability to operate independently, maintain focus, and drive projects forward with minimal oversight; a meticulous approach to quality, ensuring accuracy, reliability, and consistency in all deliverables; and proven mentorship capabilities, including the ability to guide, coach, and upskill junior data scientists and analysts.
Experience Preferred
Five (5)+ years of professional experience leading data science initiatives, including developing machine learning models, statistical analyses, and end-to-end data science workflows in production environments.
Three (3)+ years of experience working with Databricks and similar cloud-based analytics platforms, including notebook development, feature engineering, ML model training, and workflow orchestration.
Three (3)+ years of experience applying advanced analytics and predictive modeling (e.g., regression, classification, clustering, forecasting, natural language processing).
Two (2)+ years of experience implementing MLOps practices, such as model versioning, CI/CD for ML, MLflow, automated pipelines, and model performance monitoring.
Two (2)+ years of experience collaborating with data engineering teams to design data pipelines, optimize data transformations, and implement Lakehouse or data warehouse architectures (e.g., Databricks, Snowflake, SQL-based platforms).
Two (2)+ years of experience mentoring or supervising junior data scientists or analysts, including code reviews, training, and structured skill development.
Two (2)+ years of experience with Python and SQL programming, using data sources such as SQL Server, Oracle, PostgreSQL, or similar relational databases.
One (1)+ year of experience operationalizing analytics within enterprise governance frameworks, partnering with Data Management, Security, and IT to ensure compliance, reproducibility, and best practices.
Education Preferred
This classification requires possession of a Master's degree or higher in Data Science, Statistics, Computer Science, or a closely related field. Additional qualifying professional experience may be substituted for the required education on a year-for-year basis. At least one of the following industry-recognized certifications in data science or cloud analytics, such as: • Microsoft Azure Data Scientist Associate (DP-100) • Databricks Certified Data Scientist or Machine Learning Professional • AWS Machine Learning Specialty • Google Professional Data Engineer • or equivalent advanced analytics certifications. The certification is required and may not be substituted with additional experience.
Additional Information
California Resident Candidates Only. This position is HYBRID (2 days onsite, 2 days telework). Interviews will be conducted via Microsoft Teams. The work schedule follows a 4/40 (10-hour days, Monday-Thursday), with the specific shift determined by the program manager. Shifts may range between 7:15 a.m. and 6:00 p.m
Data Scientist
Data scientist job in Santa Rosa, CA
Key Responsibilities
Design and productionize models for opportunity scanning, anomaly detection, and significant change detection across CRM, streaming, ecommerce, and social data.
Define and tune alerting logic (thresholds, SLOs, precision/recall) to minimize noise while surfacing high-value marketing actions.
Partner with marketing, product, and data engineering to operationalize insights into campaigns, playbooks, and automated workflows, with clear monitoring and experimentation.
Required Qualifications
Strong proficiency in Python (pandas, NumPy, scikit-learn; plus experience with PySpark or similar for large-scale data) and SQL on modern warehouses (e.g., BigQuery, Snowflake, Redshift).
Hands-on experience with time-series modeling and anomaly / changepoint / significant-movement detection(e.g., STL decomposition, EWMA/CUSUM, Bayesian/prophet-style models, isolation forests, robust statistics).
Experience building and deploying production ML pipelines (batch and/or streaming), including feature engineering, model training, CI/CD, and monitoring for performance and data drift.
Solid background in statistics and experimentation: hypothesis testing, power analysis, A/B testing frameworks, uplift/propensity modeling, and basic causal inference techniques.
Familiarity with cloud platforms (GCP/AWS/Azure), orchestration tools (e.g., Airflow/Prefect), and dashboarding/visualization tools to expose alerts and model outputs to business users.
Data Scientist
Data scientist job in Thousand Oaks, CA
Axtria is a leading global provider of cloud software and data analytics tailored for the Life Sciences industry. Since our inception in 2010, we have pioneered technology-driven solutions to revolutionize the commercialization journey, driving sales growth, and enhancing patient healthcare outcomes. Committed to impacting millions of lives positively, our innovative platforms deploy cutting-edge Artificial Intelligence and Machine Learning technologies. With a presence in over 30 countries, Axtria is a key player in delivering commercial solutions to the Life Sciences sector, consistently recognized for our growth and technological advancements.
Job Description:
We are looking for a Project Lead for our Decision Science practice. Success in this position requires managing consulting projects/engagements delivering Brand Analytics, Real World Data (RWD) Analytics, Commercial Analytics, Marketing Analytics, and Market Access Analytics solutions.
Candidates will be expected to have familiarity with:
Patient analytics using Real World Data (RWD) sources such as Claims data, EHR/EMR data, lab/diagnostic testing data, etc.
Predictive modeling using Real World Data
Patient and HCP segmentation
Campaign effectiveness, promotion response modeling, marketing mix optimization
Marketing analytics incl. digital marketing
Required skills and experience:
Overall, 4-8 years of relevant work experience and 2+ years of US local experience in pharma analytics
Knowledge of the Biopharmaceutical domain. Prior experience in analytics in therapeutic areas of Oncology, Inflammation, Cardio and Bone will be preferred
Exposure to syndicated data sets including Claims, EMR/EHR data and exposure to/experience working with large data sets.
Strong quantitative and analytical skills, including sound knowledge of statistical concepts and predictive modeling/machine learning.
Demonstrated ability to frame and scope business problems, design solutions, and deliver results.
Excellent spoken and written communication skills, including superior visualization, storyboarding, and presentation skills.
Ability to communicate actionable analytical findings to a technical or non-technical audience in clear and concise language.
Relevant expertise in using analytical tools such as R/Python, Alteryx, Dataiku etc. and ability to quickly master new analytics tools/software as needed.
Ability to lead project teams and own project delivery.
Logistics and Location:
U.S. Citizens and those authorized to work in the U.S. are encouraged to apply.
The position is based out of Thousand Oaks and the candidate needs to be at the client site 3-5 days per week
Axtria is an EEO/AA employer M/F/D/V. We offer attractive performance-based compensation packages including salary and bonus. Comprehensive benefits are available including health insurance, flexible spending accounts, and 401k with company match. Immigration sponsorship will be considered.
Pay Transparency Laws
Salary range or hourly pay range for the position
The salary range for this position is $83,200 to $129,738 annually. The actual salary will vary based on applicant's education, experience, skills, and abilities, as well as internal equity and alignment with market data. The salary may also be adjusted based on applicant's geographic location.
The salary range reflected is based on a primary work location of Thousand Oaks, CA. The actual salary may vary for applicants in a different geographic location.
Data Scientist V
Data scientist job in Menlo Park, CA
Creospan is a growing tech collective of makers, shakers, and problem solvers, offering solutions today that will propel businesses into a better tomorrow. “Tomorrow's ideas, built today!” In addition to being able to work alongside equally brilliant and motivated developers, our consultants appreciate the opportunity to learn and apply new skills and methodologies to different clients and industries.
******NO C2C/3RD PARTY, LOOKING FOR W2 CANDIDATES ONLY, must be able to work in the US without sponsorship now or in the future***
Summary:
The main function of the Data Scientist is to produce innovative solutions driven by exploratory data analysis from complex and high-dimensional datasets.
Job Responsibilities:
• Apply knowledge of statistics, machine learning, programming, data modeling, simulation, and advanced mathematics to recognize patterns, identify opportunities, pose business questions, and make valuable discoveries leading to prototype development and product improvement.
• Use a flexible, analytical approach to design, develop, and evaluate predictive models and advanced algorithms that lead to optimal value extraction from the data.
• Generate and test hypotheses and analyze and interpret the results of product experiments.
• Work with product engineers to translate prototypes into new products, services, and features and provide guidelines for large-scale implementation.
• Provide Business Intelligence (BI) and data visualization support, which includes, but limited to support for the online customer service dashboards and other ad-hoc requests requiring data analysis and visual support.
Skills:
• Experienced in either programming languages such as Python and/or R, big data tools such as Hadoop, or data visualization tools such as Tableau.
• The ability to communicate effectively in writing, including conveying complex information and promoting in-depth engagement on course topics.
• Experience working with large datasets.
Education/Experience:
• Master of Science degree in computer science or in a relevant field.
Senior Data Scientist
Data scientist job in Pleasanton, CA
Net2Source is a Global Workforce Solutions Company headquartered at NJ, USA with its branch offices in Asia Pacific Region. We are one of the fastest growing IT Consulting company across the USA and we are hiring
" Senior Data Scientist
" for one of our clients. We offer a wide gamut of consulting solutions customized to our 450+ clients ranging from Fortune 500/1000 to Start-ups across various verticals like Technology, Financial Services, Healthcare, Life Sciences, Oil & Gas, Energy, Retail, Telecom, Utilities, Technology, Manufacturing, the Internet, and Engineering.
Position: Senior Data Scientist
Location: Pleasanton, CA (Onsite) - Locals Only
Type: Contract
Exp Level - 10+ Years
Required Skills
Design, develop, and deploy advanced marketing models, including:
Build and productionize NLP solutions.
Partner with Marketing and Business stakeholders to translate business objectives into data science solutions.
Work with large-scale structured and unstructured datasets using SQL, Python, and distributed systems.
Evaluate and implement state-of-the-art ML/NLP techniques to improve model performance and business impact.
Communicate insights, results, and recommendations clearly to both technical and non-technical audiences.
Required Qualifications
5+ years of experience in data science or applied machine learning, with a strong focus on marketing analytics.
Hands-on experience building predictive marketing models (e.g., segmentation, attribution, personalization).
Strong expertise in NLP techniques and libraries (e.g., spa Cy, NLTK, Hugging Face, Gensim).
Proficiency in Python, SQL, and common data science libraries (pandas, NumPy, scikit-learn).
Solid understanding of statistics, machine learning algorithms, and model evaluation.
Experience deploying models into production environments.
Strong communication and stakeholder management skills.
Why Work With Us?
We believe in more than just jobs-we build careers. At Net2Source, we champion leadership at all levels, celebrate diverse perspectives, and empower you to make an impact. Think work-life balance, professional growth, and a collaborative culture where your ideas matter.
Our Commitment to Inclusion & Equity
Net2Source is an equal opportunity employer, dedicated to fostering a workplace where diverse talents and perspectives are valued. We make all employment decisions based on merit, ensuring a culture of respect, fairness, and opportunity for all, regardless of age, gender, ethnicity, disability, or other protected characteristics.
Awards & Recognition
America's Most Honored Businesses (Top 10%)
Fastest-Growing Staffing Firm by Staffing Industry Analysts
INC 5000 List for Eight Consecutive Years
Top 100 by
Dallas Business Journal
Spirit of Alliance Award by Agile1
Maddhuker Singh
Sr Account & Delivery Manager
***********************
Senior Data Engineer - Spark, Airflow
Data scientist job in Santa Rosa, CA
We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives.
In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high performance data processing systems and ensure data quality, reliability, and lineage across Mastercard's data ecosystem.
The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.
Responsibilities:
Design and optimize Spark-based ETL pipelines for large-scale data processing.
Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing.
Implement partitioning and shuffling strategies to improve Spark performance.
Ensure data lineage, quality, and traceability across systems.
Develop Python scripts for data transformation, aggregation, and validation.
Execute and tune Spark jobs using spark-submit.
Perform DataFrame joins and aggregations for analytical insights.
Automate multi-step processes through shell scripting and variable management.
Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.
Qualifications:
Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience).
At least 7 years of experience in data engineering or big data development.
Strong expertise in Apache Spark architecture, optimization, and job configuration.
Proven experience with Airflow DAGs using authoring, scheduling, checkpointing, monitoring.
Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
Expertise in Python programming including data structures and algorithmic problem-solving.
Hands-on with Spark DataFrames and PySpark transformations using joins, aggregations, filters.
Proficient in shell scripting, including managing and passing variables between scripts.
Experienced with spark submit for deployment and tuning.
Solid understanding of ETL design, workflow automation, and distributed data systems.
Excellent debugging and problem-solving skills in large-scale environments.
Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
Knowledge of data lineage and data quality frameworks like Apache Atlas.
Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
Data Engineer
Data scientist job in Irvine, CA
Thank you for stopping by to take a look at the Data Integration Engineer role I posted here on LinkedIN, I appreciate it.
If you have read my s in the past, you will recognize how I write job descriptions. If you are new, allow me to introduce myself. My name is Tom Welke. I am Partner & VP at RSM Solutions, Inc and I have been recruiting technical talent for more than 23 years and been in the tech space since the 1990s. Due to this, I actually write JD's myself...no AI, no 'bots', just a real live human. I realized a while back that looking for work is about as fun as a root canal with no anesthesia...especially now. So, rather than saying 'must work well with others' and 'team mindset', I do away with that kind of nonsense and just tell it like it is.
So, as with every role I work on, social fit is almost as important as technical fit. For this one, technical fit is very very important. But, we also have some social fit characteristics that are important. This is the kind of place that requires people to dive in and learn. The hiring manager for this one is actually a very dear friend of mine. He said something interesting to me not all that long ago. He mentioned, if you aren't spending at least an hour a day learning something new, you really are doing yourself a disservice. This is that classic environment where no one says 'this is not my job'. So that ability to jump in and help is needed for success in this role.
This role is being done onsite in Irvine, California. I prefer working with candidates that are already local to the area. If you need to relocate, that is fine, but there are no relocation dollars available.
I can only work with US Citizens or Green Card Holders for this role. I cannot work with H1, OPT, EAD, F1, H4, or anyone that is not already a US Citizen or Green Card Holder for this role.
The Data Engineer role is similar to the Data Integration role I posted. However, this one is mor Ops focused, with the orchestration of deployment and ML flow, and including orchestrating and using data on the clusters and managing how the models are performing. This role focuses on coding & configuring on the ML side of the house.
You will be designing, automating, and observing end to end data pipelines that feed this client's Kubeflow driven machine learning platform, ensuring models are trained, deployed, and monitored on trustworthy, well governed data. You will build batch/stream workflows, wire them into Azure DevOps CI/CD, and surface real time health metrics in Prometheus + Grafana dashboards to guarantee data availability. The role bridges Data Engineering and MLOps, allowing data scientists to focus on experimentation and the business sees rapid, reliable predictive insight.
Here are some of the main responsibilities:
Design and implement batch and streaming pipelines in Apache Spark running on Kubernetes and Kubeflow Pipelines to hydrate feature stores and training datasets.
Build high throughput ETL/ELT jobs with SSIS, SSAS, and T SQL against MS SQL Server, applying Data Vault style modeling patterns for auditability.
Integrate source control, build, and release automation using GitHub Actions and Azure DevOps for every pipeline component.
Instrument pipelines with Prometheus exporters and visualize SLA, latency, and error budget metrics to enable proactive alerting.
Create automated data quality and schema drift checks; surface anomalies to support a rapid incident response process.
Use MLflow Tracking and Model Registry to version artifacts, parameters, and metrics for reproducible experiments and safe rollbacks.
Work with data scientists to automate model retraining and deployment triggers within Kubeflow based on data freshness or concept drift signals.
Develop PowerShell and .NET utilities to orchestrate job dependencies, manage secrets, and publish telemetry to Azure Monitor.
Optimize Spark and SQL workloads through indexing, partitioning, and cluster sizing strategies, benchmarking performance in CI pipelines.
Document lineage, ownership, and retention policies; ensure pipelines conform to PCI/SOX and internal data governance standards.
Here is what we are seeking:
At least 6 years of experience building data pipelines in Spark or equivalent.
At least 2 years deploying workloads on Kubernetes/Kubeflow.
At least 2 years of experience with MLflow or similar experiment‑tracking tools.
At least 6 years of experience in T‑SQL, Python/Scala for Spark.
At least 6 years of PowerShell/.NET scripting.
At least 6 years of experience with with GitHub, Azure DevOps, Prometheus, Grafana, and SSIS/SSAS.
Kubernetes CKA/CKAD, Azure Data Engineer (DP‑203), or MLOps‑focused certifications (e.g., Kubeflow or MLflow) would be great to see.
Mentor engineers on best practices in containerized data engineering and MLOps.
Senior Data Engineer
Data scientist job in Newport Beach, CA
Pacific Life is a Fortune 500 company headquartered in Newport Beach, California, with nearly 160 years of experience supporting policyholders. The company provides innovative life insurance, annuity solutions, and mutual funds to help individuals and businesses achieve financial security. Serving both retail and institutional markets, Pacific Life emphasizes offering value and stability to current and future generations. With a strong commitment to excellence, Pacific Life has earned a reputation for financial strength and reliability.
Role Description
This is a full-time, hybrid Senior Data Engineer position based in Newport Beach, California. The Senior Data Engineer will be responsible for designing, building, and maintaining scalable data pipelines and architecture. Day-to-day responsibilities include implementing Extract Transform Load (ETL) processes, managing data warehouses, and supporting data analytics initiatives to drive business insights. Collaboration with cross-functional teams and ensuring data quality and efficiency are essential aspects of this role.
Qualifications
Proficiency in Data Engineering, including building and maintaining large-scale data pipelines
Experience in Data Modeling, database design, and optimizing data structures
Hands-on expertise in Extract Transform Load (ETL) processes and tools
Knowledge of Data Warehousing technologies and managing data storage solutions
Familiarity with Data Analytics and deriving meaningful insights from datasets
Strong problem-solving skills and the ability to work collaboratively in a team environment
Bachelor's degree in Computer Science, Data Science, or a related field
7+ year of experience in analysis, design, development and delivery of data
7+ years of experience in SQL, ETL, ELT, leading cloud data warehouse technologies, data transformations and data management tools
Experience working in Azure Dev Ops (ADO), Build and Release CI/CD pipelines and orchestration
Experience with AWS
2+ years of experience with Snowflake, dbt, Matillion or Profisee
Understanding of data catalogs, glossary, data quality and effective data governance
Master Data Management (MDM)/Reference Data experience
Sr Data Platform Engineer
Data scientist job in Elk Grove, CA
Hybrid role 3X a week in office in Elk Grove, CA; no remote capabilities
This is a direct hire opportunity.
We're seeking a seasoned Senior Data Platform Engineer to design, build, and optimize scalable data solutions that power analytics, reporting, and AI/ML initiatives. This full‑time role is hands‑on, working with architects, analysts, and business stakeholders to ensure data systems are reliable, secure, and high‑performing.
Responsibilites:
Build and maintain robust data pipelines (structured, semi‑structured, unstructured).
Implement ETL workflows with Spark, Delta Lake, and cloud‑native tools.
Support big data platforms (Databricks, Snowflake, GCP) in production.
Troubleshoot and optimize SQL queries, Spark jobs, and workloads.
Ensure governance, security, and compliance across data systems.
Integrate workflows into CI/CD pipelines with Git, Jenkins, Terraform.
Collaborate cross‑functionally to translate business needs into technical solutions.
Qualifications:
7+ years in data engineering with production pipeline experience.
Expertise in Spark ecosystem, Databricks, Snowflake, GCP.
Strong skills in PySpark, Python, SQL.
Experience with RAG systems, semantic search, and LLM integration.
Familiarity with Kafka, Pub/Sub, vector databases.
Proven ability to optimize ETL jobs and troubleshoot production issues.
Agile team experience and excellent communication skills.
Certifications in Databricks, Snowflake, GCP, or Azure.
Exposure to Airflow, BI tools (Power BI, Looker Studio).
Lead Data Engineer - (Automotive exp)
Data scientist job in Torrance, CA
Role: Sr Technical Lead
Duration: 12+ Month Contract
Daily Tasks Performed:
Lead the design, development, and deployment of a scalable, secure, and high-performance CDP SaaS product.
Architect solutions that integrate with various data sources, APIs, and third-party platforms.
Design, develop, and optimize complex SQL queries for data extraction, transformation, and analysis
Build and maintain workflow pipelines using Digdag, integrating with data platforms such as Treasure Data, AWS, or other cloud services
Automate ETL processes and schedule tasks using Digdag's YAML-based workflow definitions
Implement data quality checks, logging, and alerting mechanisms within workflow
Leverage AWS services (e.g., S3, Lambda, Athena) where applicable to enhance data processing and storage capabilities
Ensure best practices in software engineering, including code reviews, testing, CI/CD, and documentation.
Oversee data privacy, security, and compliance initiatives (e.g., GDPR, CCPA).
Ensure adherence to security, compliance, and data governance requirements.
Oversee development of real-time and batch data processing systems.
Collaborate with cross-functional teams including data analysts, product managers, and software engineers to translate business requirements into technical solutions
Collaborate with the stakeholders to define technical requirements to align technical solutions with business goals and deliver product features.
Mentor and guide developers, fostering a culture of technical excellence and continuous improvement.
Troubleshoot complex technical issues and provide hands-on support as needed.
Monitor, troubleshoot, and improve data workflows for performance, reliability, and cost-efficiency as needed
Optimize system performance, scalability, and cost efficiency.
What this person will be working on:
As the Senior Technical Lead for our Customer Data Platform (CDP), the candidate will define the technical strategy, architecture, and execution of the platform. They will lead the design and delivery of scalable, secure, and high-performing solutions that enable unified customer data management, advanced analytics, and personalized experiences. This role demands deep technical expertise, strong leadership, and a solid understanding of data platforms and modern cloud technologies. It is a pivotal position that supports the CDP vision by mentoring team members and delivering solutions that empower our customers to unify, analyze, and activate their data.
Position Success Criteria (Desired) - 'WANTS'
Bachelor's or Master's degree in Computer Science, Engineering, or related field.
8+ years of software development experience, with at least 3+ years in a technical leadership role.
Proven experience building and scaling SaaS products, preferably in customer data, marketing technology, or analytics domains
Extensive hands-on experience with Presto, Hive, and Python
Strong proficiency in writing complex SQL queries for data extraction, transformation, and analysis
Familiarity with AWS data services such as S3, Athena, Glue, and Lambda
Deep understanding of data modeling, ETL pipelines, workflow orchestration, and both real-time and batch data processing
Experience ensuring data privacy, security, and compliance in SaaS environments
Knowledge of Customer Data Platforms (CDPs), CDP concepts, and integration with CRM, marketing, and analytics tools
Excellent communication, leadership, and project management skills
Experience working with Agile methodologies and DevOps practices
Ability to thrive in a fast-paced, agile environment
Collaborative mindset with a proactive approach to problem-solving
Stay current with industry trends and emerging technologies relevant to SaaS and customer data platforms.
Training Assessment Data Scientist
Data scientist job in Twentynine Palms, CA
The Opportunity:
As a data scientist, you're excited at the prospect of unlocking the secrets held by a data set, and you're fascinated by the possibilities presented by IoT, machine learning, and artificial intelligence. In an increasingly connected world, massive amounts of structured and unstructured data open new opportunities. As a data scientist at Booz Allen, you can help turn these complex data sets into useful information to solve global challenges. Across private and public sectors-from fraud detection to cancer research to national intelligence-we need you to help find the answers in the data.
On our team, you'll use your data science expertise to help the client conduct training assessments and after actions based on data generated during live, virtual and constructive training. You'll work closely with clients to understand their questions and needs, and then dig into their data-rich environments to find the pieces of their information puzzle. You'll use the right combination of tools and frameworks to turn sets of disparate data points into objective answers to increase technical and tactical expertise of Marine Corps units. Ultimately, you'll provide a deep understanding of the data, what it all means, and how it can be used to improve Marine Corps training readiness.
Work with us as we use data science for good.
Join us. The world can't wait.
You Have:
5+ years of experience with data exploration, data cleaning, data analysis, data visualization, or data mining
5+ years of experience with statistical and general-purpose programming languages for data analysis
5+ years of experience analyzing structured and unstructured data sources
Experience developing predictive data models, quantitative analyses and visualization of targeted data sources
Experience leading the development of solutions to complex programs
Experience with natural language processing, text mining, or machine learning techniques
Secret clearance
Bachelor's degree
Nice If You Have:
Experience working with Marine Corps Live, Virtual and Constructive Training Systems
Experience with distributed data and computing tools, including MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL
Experience with visualization packages, including Plotly, Seaborn, or ggplot2
Clearance:
Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified information; Secret clearance is required.
Compensation
At Booz Allen, we celebrate your contributions, provide you with opportunities and choices, and support your total well-being. Our offerings include health, life, disability, financial, and retirement benefits, as well as paid leave, professional development, tuition assistance, work-life programs, and dependent care. Our recognition awards program acknowledges employees for exceptional performance and superior demonstration of our values. Full-time and part-time employees working at least 20 hours a week on a regular basis are eligible to participate in Booz Allen's benefit programs. Individuals that do not meet the threshold are only eligible for select offerings, not inclusive of health benefits. We encourage you to learn more about our total benefits by visiting the Resource page on our Careers site and reviewing Our Employee Benefits page.
Salary at Booz Allen is determined by various factors, including but not limited to location, the individual's particular combination of education, knowledge, skills, competencies, and experience, as well as contract-specific affordability and organizational requirements. The projected compensation range for this position is $77,600.00 to $176,000.00 (annualized USD). The estimate displayed represents the typical salary range for this position and is just one component of Booz Allen's total compensation package for employees. This posting will close within 90 days from the Posting Date.
Identity Statement
As part of the application process, you are expected to be on camera during interviews and assessments. We reserve the right to take your picture to verify your identity and prevent fraud.
Work Model
Our people-first culture prioritizes the benefits of flexibility and collaboration, whether that happens in person or remotely.
If this position is listed as remote or hybrid, you'll periodically work from a Booz Allen or client site facility.
If this position is listed as onsite, you'll work with colleagues and clients in person, as needed for the specific role.
Commitment to Non-Discrimination
All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran or any other status protected by applicable federal, state, local, or international law.
Auto-ApplyStaff Data Scientist
Data scientist job in San Francisco, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge- apply today and we'll be in touch.
Data Scientist
Data scientist job in San Jose, CA
Key Responsibilities
Design and productionize models for opportunity scanning, anomaly detection, and significant change detection across CRM, streaming, ecommerce, and social data.
Define and tune alerting logic (thresholds, SLOs, precision/recall) to minimize noise while surfacing high-value marketing actions.
Partner with marketing, product, and data engineering to operationalize insights into campaigns, playbooks, and automated workflows, with clear monitoring and experimentation.
Required Qualifications
Strong proficiency in Python (pandas, NumPy, scikit-learn; plus experience with PySpark or similar for large-scale data) and SQL on modern warehouses (e.g., BigQuery, Snowflake, Redshift).
Hands-on experience with time-series modeling and anomaly / changepoint / significant-movement detection(e.g., STL decomposition, EWMA/CUSUM, Bayesian/prophet-style models, isolation forests, robust statistics).
Experience building and deploying production ML pipelines (batch and/or streaming), including feature engineering, model training, CI/CD, and monitoring for performance and data drift.
Solid background in statistics and experimentation: hypothesis testing, power analysis, A/B testing frameworks, uplift/propensity modeling, and basic causal inference techniques.
Familiarity with cloud platforms (GCP/AWS/Azure), orchestration tools (e.g., Airflow/Prefect), and dashboarding/visualization tools to expose alerts and model outputs to business users.
Senior Data Engineer - Spark, Airflow
Data scientist job in San Francisco, CA
We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives.
In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high performance data processing systems and ensure data quality, reliability, and lineage across Mastercard's data ecosystem.
The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.
Responsibilities:
Design and optimize Spark-based ETL pipelines for large-scale data processing.
Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing.
Implement partitioning and shuffling strategies to improve Spark performance.
Ensure data lineage, quality, and traceability across systems.
Develop Python scripts for data transformation, aggregation, and validation.
Execute and tune Spark jobs using spark-submit.
Perform DataFrame joins and aggregations for analytical insights.
Automate multi-step processes through shell scripting and variable management.
Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.
Qualifications:
Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience).
At least 7 years of experience in data engineering or big data development.
Strong expertise in Apache Spark architecture, optimization, and job configuration.
Proven experience with Airflow DAGs using authoring, scheduling, checkpointing, monitoring.
Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
Expertise in Python programming including data structures and algorithmic problem-solving.
Hands-on with Spark DataFrames and PySpark transformations using joins, aggregations, filters.
Proficient in shell scripting, including managing and passing variables between scripts.
Experienced with spark submit for deployment and tuning.
Solid understanding of ETL design, workflow automation, and distributed data systems.
Excellent debugging and problem-solving skills in large-scale environments.
Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
Knowledge of data lineage and data quality frameworks like Apache Atlas.
Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
Staff Data Scientist
Data scientist job in San Jose, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge- apply today and we'll be in touch.