Cybersecurity Engineer III
Data engineer job in San Diego, CA
SimVentions, consistently voted one of Virginia's Best Places to Work, is looking for an experienced cybersecurity professional to join our team! As a Cybersecurity Engineer III, you will play a key role in advancing cybersecurity operations by performing in-depth system hardening, vulnerability assessment, and security compliance activities in accordance with DoD requirements. The ideal candidate will have a solid foundation in cybersecurity practices and proven experience supporting both Linux and Windows environments across DoD networks. You will work collaboratively with Blue Team, Red Team, and other cybersecurity professionals on overall cyber readiness defense and system accreditation efforts.
Position is contingent upon award of contract, anticipated in December 2025.
Clearance:
An ACTIVE Secret clearance (IT Level II Tier 5 / Special-Sensitive Position) is required for this position. Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified information. US Citizenship is required to obtain a clearance.
Requirements:
In-depth understanding of computer security, military system specifications, and DoD cybersecurity policies
Strong ability to communicate clearly and succinctly in written and oral presentations
Must possess one of the following DoD 8570.01-M IAT Level III baseline certifications:
CASP+ CE
CCNP Security
CISA
CISSP (Associate)
CISSP
GCED
GCIH
CCSP
Responsibilities:
Develop Assessment and Authorization (A&A) packages for various systems
Develop and maintain security documentation such as:
Authorization Boundary Diagram
System Hardware/Software/Information Flow
System Security Plan
Privacy Impact Assessment
e-Authentication
Implementation Plan
System Level Continuous Monitoring Plan
Ports, Protocols and Services Registration
Plan of Action and Milestones (POA&M)
Conduct annual FISMA assessments
Perform Continuous Monitoring of Authorized Systems
Generate and update test plans; conduct testing of the system components using the Assured Compliance Assessment Solution (ACAS) tool, implement Security Technical Implementation Guides (STIG), and conduct Information Assurance Vulnerability Management (IAVM) reviews
Perform automated ACAS scanning, STIG, SCAP checks (Evaluate STIG, Tenable Nessus, etc.) on various standalone and networked systems
Analyze cybersecurity test scan results and develop/assist with documenting open findings in the Plan of Action and Milestones (POA&M)
Analyze DISA Security Technical Implementation Guide test results and develop/assist with documenting open findings in the Plan of Action and Milestones
Preferred Skills and Experience:
A combined total of ten (10) years of full-time professional experience in all of the following functional areas:
Computer security, military system specifications, and DoD cybersecurity policies
National Cyber Range Complex (NCRC) Total Ship Computing Environment (TSCE) Program requirements and mission, ship install requirements, and protocols (preferred)
Risk Management Framework (RMF), and the implementation of Cybersecurity and IA boundary defense techniques and various IA-enabled appliances. Examples of these appliances and applications are Firewalls, Intrusion Detection System (IDS), Intrusion Prevention System (IPS), Switches/Routers, Cross Domain Solutions (CDS), eMASS, and Endpoint Security Solution (ESS)
Performing STIG implementation
Performing vulnerability assessments with the ACAS tool
Remediating vulnerability findings to include implementing vendor patches on both Linux and Windows Operating systems
Education: Bachelor of Science in Information Systems, Information Technology, Computer Science, or Computer Engineering
Compensation:
Compensation at SimVentions is determined by a number of factors, including, but not limited to, the candidate's experience, education, training, security clearance, work location, skills, knowledge, and competencies, as well as alignment with our corporate compensation plan and contract specific requirements.
The projected annual compensation range for this position is $90,000 - $160,000 (USD). This estimate reflects the standard salary range for this position and is just one component of the total compensation package that SimVentions offers.
Benefits:
At SimVentions, we're committed to supporting the total well-being of our employees and their families. Our benefit offerings include comprehensive health and welfare plans to serve a variety of needs.
We offer:
Medical, dental, vision, and prescription drug coverage
Employee Stock Ownership Plan (ESOP)
Competitive 401(k) programs
Retirement and Financial Counselors
Health Savings and Health Reimbursement Accounts
Flexible Spending Accounts
Life insurance, short- & long-term disability
Continuing Education Assistance
Paid Time Off, Paid Holidays, Paid Leave (e.g., Maternity, Paternity, Jury Duty, Bereavement, Military)
Third Party Employee Assistance Program that offers emotional and lifestyle well-being services, to include free counseling
Supplemental Benefit Program
Why Work for SimVentions?:
SimVentions is about more than just being a place to work with other growth-oriented, technically exceptional experts. It's also a fun place to work. Our family-friendly atmosphere encourages our employee-owners to imagine, create, explore, discover, and do great things together.
Support Our Warfighters
SimVentions is a proud supporter of the U.S. military, and we take pride in our ability to provide relevant, game-changing solutions to our armed men and women around the world.
Drive Customer Success
We deliver innovative products and solutions that go beyond the expected. This means you can expect to work with a team that will allow you to grow, have a voice, and make an impact.
Get Involved in Giving Back
We believe a well-rounded company starts with well-rounded employees, which is why we offer diverse service opportunities for our team throughout the year.
Build Innovative Technology
SimVentions takes pride in its innovative and cutting-edge technology, so you can be sure that whatever project you work on, you will be having a direct impact on our customer's success.
Work with Brilliant People
We don't just hire the smartest people; we seek experienced, creative individuals who are passionate about their work and thrive in our unique culture.
Create Meaningful Solutions
We are trusted partners with our customers and are provided challenging and meaningful requirements to help them solve.
Employees who join SimVentions will enjoy additional perks like:
Employee Ownership: Work with the best and help build YOUR company!
Family focus: Work for a team that recognizes the importance of family time.
Culture: Add to our culture of technical excellence and collaboration.
Dress code: Business casual, we like to be comfortable while we work.
Resources: Excellent facilities, tools, and training opportunities to grow in your field.
Open communication: Work in an environment where your voice matters.
Corporate Fellowship: Opportunities to participate in company sports teams and employee-led interest groups for personal and professional development.
Employee Appreciation: Multiple corporate events throughout the year, including Holiday Events, Company Picnic, Imagineering Day, and more.
Founding Partner of the FredNats Baseball team: Equitable distribution of tickets for every home game to be enjoyed by our employee-owners and their families from our private suite.
Food: We have a lot of food around here!
Software Development Engineer, AI/ML, AWS Neuron, Model Inference
Data engineer job in Cupertino, CA
The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. This comprehensive toolkit includes an ML compiler, runtime, and application framework that integrates seamlessly with popular ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance.
The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architectures while maximizing their performance on AWS's custom ML accelerators. Working across the stack, from PyTorch down to the hardware-software boundary, our engineers build systematic infrastructure, innovate new methods, and create high-performance kernels for ML functions, ensuring every compute unit is fine-tuned for optimal performance on our customers' demanding workloads. We combine deep hardware knowledge with ML expertise to push the boundaries of what's possible in AI acceleration.
As part of the broader Neuron organization, our team works across multiple technology layers, from frameworks and kernels to the compiler, runtime, and collectives. We not only optimize current performance but also contribute to future architecture designs, working closely with customers to enable their models and ensure optimal performance. This role offers a unique opportunity to work at the intersection of machine learning, high-performance computing, and distributed architectures, where you'll help shape the future of AI acceleration technology.
You will architect and implement business-critical features, and mentor a brilliant team of experienced engineers. We operate in spaces that are very large, yet our teams remain small and agile. There is no blueprint. We're inventing. We're experimenting. It is a unique learning culture. The team works closely with customers on their model enablement, providing direct support and optimization expertise to ensure their machine learning workloads achieve optimal performance on AWS ML accelerators. The team collaborates with open source ecosystems to provide seamless integration and bring peak performance at scale for customers and developers.
This role is responsible for development, enablement, and performance tuning of a wide variety of LLM model families, including massive-scale large language models like the Llama family, DeepSeek, and beyond. The Inference Enablement and Acceleration team works side by side with compiler engineers and runtime engineers to create, build, and tune distributed inference solutions with Trainium and Inferentia. Experience optimizing inference performance for both latency and throughput on such large models, across the stack from system-level optimizations through to PyTorch or JAX, is a must-have.
You can learn more about Neuron in the AWS Neuron documentation.
Key job responsibilities
This role will help lead the efforts in building distributed inference support for PyTorch in the Neuron SDK. This role will tune these models to ensure the highest performance and maximize their efficiency running on AWS Trainium and Inferentia silicon and servers. Strong software development skills in Python, systems-level programming, and ML knowledge are all critical to this role. Our engineers collaborate across compiler, runtime, framework, and hardware teams to optimize machine learning workloads for our global customer base. Working at the intersection of software, hardware, and machine learning systems, you'll bring expertise in low-level optimization, system architecture, and ML model acceleration. In this role, you will:
* Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators.
* Participate in all stages of the ML system development lifecycle including distributed computing based architecture design, implementation, performance profiling, hardware-specific optimizations, testing and production deployment.
* Build infrastructure to systematically analyze and onboard multiple models with diverse architecture.
* Design and implement high-performance kernels and features for ML operations, leveraging the Neuron architecture and programming models
* Analyze and optimize system-level performance across multiple generations of Neuron hardware
* Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks
* Implement optimizations such as fusion, sharding, tiling, and scheduling (a toy sharding sketch follows this list)
* Conduct comprehensive testing, including unit and end-to-end model testing with continuous deployment and releases through pipelines.
* Work directly with customers to enable and optimize their ML models on AWS accelerators
* Collaborate across teams to develop innovative optimization techniques
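To make the sharding and tiling bullet above concrete, here is a toy sketch of column-wise weight sharding in plain PyTorch. This illustrates the general tensor-parallel idea only; it is not AWS Neuron SDK code, and all shapes and names are made up.

```python
# Toy tensor-parallel sketch: shard a linear layer's weight across "devices".
import torch

torch.manual_seed(0)
batch, d_in, d_out, n_shards = 4, 8, 16, 2

x = torch.randn(batch, d_in)
w = torch.randn(d_out, d_in)

y_ref = x @ w.T  # reference: unsharded matmul

# Shard the weight along the output dimension; in a real deployment each
# shard would live on a separate accelerator core.
shards = torch.chunk(w, n_shards, dim=0)
y_parts = [x @ ws.T for ws in shards]  # independent partial matmuls
y = torch.cat(y_parts, dim=1)          # concatenation plays the all-gather role

assert torch.allclose(y, y_ref, atol=1e-5)
```

On real hardware the payoff comes from running the partial matmuls in parallel and overlapping the gather with compute, which is exactly the kind of scheduling decision this role works on.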
A day in the life
You will collaborate with a cross-functional team of applied scientists, system engineers, and product managers to deliver state-of-the-art inference capabilities for Generative AI applications. Your work will involve debugging performance issues, optimizing memory usage, and shaping the future of Neuron's inference stack across Amazon and the Open Source Community. As you design and code solutions to help our team drive efficiencies in software architecture, you'll create metrics, implement automation and other improvements, and resolve the root cause of software defects.
You will also build high-impact solutions to deliver to our large customer base and participate in design discussions, code review, and communicate with internal and external stakeholders. You will work cross-functionally to help drive business decisions with your technical input. You will work in a startup-like development environment, where you're always working on the most important initiative.
About the team
The Inference Enablement and Acceleration team fosters a builder's culture where experimentation is encouraged, and impact is measurable. We emphasize collaboration, technical ownership, and continuous learning. Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future. Join us to solve some of the most interesting and impactful infrastructure challenges in AI/ML today.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science or equivalent
- 5+ years of non-internship professional software development experience
- 5+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Fundamentals of machine learning and LLMs, including their architectures and training and inference lifecycles, along with work experience optimizing model execution.
- Software development experience in C++ or Python (experience in at least one language is required).
- Strong understanding of system performance, memory management, and parallel computing principles.
- Proficiency in debugging, profiling, and implementing best software engineering practices in large-scale systems.
PREFERRED QUALIFICATIONS
- Familiarity with PyTorch, JIT compilation, and AOT tracing.
- Familiarity with CUDA kernels or equivalent ML or low-level kernels
- Experience developing performant kernels (e.g., CUTLASS, FlashInfer) is a strong plus.
- Familiarity with tile-level programming syntax and semantics similar to Triton.
- Experience with online/offline inference serving with vLLM, SGLang, TensorRT or similar platforms in production environments.
- Deep understanding of computer architecture and operating-system-level software, with working knowledge of parallel computing.
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit ********************************************************* for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $129,300/year in our lowest geographic market up to $223,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit ********************************************************. This position will remain posted until filled. Applicants should apply via our internal or external career site.
Staff Data Scientist
Data engineer job in San Francisco, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale (see the sketch after this list)
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
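As a flavor of the experimentation work above, here is a minimal two-proportion z-test in Python using statsmodels; the counts are hypothetical, and real experiments would run through the company's own experimentation stack with proper power analysis and guardrail metrics.

```python
# Hypothetical A/B readout: did the treatment lift conversion?
from statsmodels.stats.proportion import proportions_ztest

conversions = [420, 480]      # successes in control, treatment (made-up numbers)
exposures = [10_000, 10_000]  # users per arm (made-up numbers)

z_stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
# Interpret against a pre-registered alpha (e.g., 0.05) and a minimum
# detectable effect chosen before the experiment started.
```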
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge, apply today and we'll be in touch.
Data Scientist
Data engineer job in Long Beach, CA
STAND 8 provides end-to-end IT solutions to enterprise partners across the United States, with offices in Los Angeles, New York, New Jersey, Atlanta, and more, including internationally in Mexico and India. We are seeking a highly analytical and technically skilled Data Scientist to transform complex, multi-source data into unified, actionable insights used for executive reporting and decision-making.
This role requires expertise in business intelligence design, data modeling, metadata management, data integrity validation, and the development of dashboards, reports, and analytics used across operational and strategic environments.
The ideal candidate thrives in a fast-paced environment, demonstrates strong investigative skills, and can collaborate effectively with technical teams, business stakeholders, and leadership.
Essential Duties & Responsibilities
As a Data Scientist, participate across the full solution lifecycle: business case, planning, design, development, testing, migration, and production support.
Analyze large and complex datasets with accuracy and attention to detail.
Collaborate with users to develop effective metadata and data relationships.
Identify reporting and dashboard requirements across business units.
Determine strategic placement of business logic within ETL or metadata models.
Build enterprise data warehouse metadata/semantic models.
Design and develop unified dashboards, reports, and data extractions from multiple data sources.
Develop and execute testing methodologies for reports and metadata models.
Document BI architecture, data lineage, and project report requirements.
Provide technical specifications and data definitions to support the enterprise data dictionary.
Apply analytical skills and Data Science techniques to understand business processes, financial calculations, data flows, and application interactions.
Identify and implement improvements, workarounds, or alternative solutions related to ETL processes, ensuring integrity and timeliness.
Create UI components or portal elements (e.g., SharePoint) for dynamic or interactive stakeholder reporting.
As a Data Scientist, download and process SQL database information to build Power BI or Tableau reports (including cybersecurity awareness campaigns).
Utilize SQL, Python, R, or similar languages for data analysis and modeling.
Support process optimization through advanced modeling, leveraging experience as a Data Scientist where needed.
Required Knowledge & Attributes
Highly self-motivated with strong organizational skills and ability to manage multiple verbal and written assignments.
Experience collaborating across organizational boundaries for data sourcing and usage.
Analytical understanding of business processes, forecasting, capacity planning, and data governance.
Proficient with BI tools (Power BI, Tableau, PBIRS, SSRS, SSAS).
Strong Microsoft Office skills (Word, Excel, Visio, PowerPoint).
High attention to detail and accuracy.
Ability to work independently, demonstrate ownership, and ensure high-quality outcomes.
Strong communication, interpersonal, and stakeholder engagement skills.
Deep understanding that data integrity and consistency are essential for adoption and trust.
Ability to shift priorities and adapt within fast-paced environments.
Required Education & Experience
Bachelor's degree in Computer Science, Mathematics, or Statistics (or equivalent experience).
3+ years of BI development experience.
3+ years with Power BI and supporting Microsoft stack tools (SharePoint 2019, PBIRS/SSRS, Excel 2019/2021).
3+ years of experience with SDLC/project lifecycle processes
3+ years of experience with data warehousing methodologies (ETL, Data Modeling).
3+ years of VBA experience in Excel and Access.
Strong ability to write SQL queries and work with SQL Server 2017-2022.
Experience with BI tools including PBIRS, SSRS, SSAS, Tableau.
Strong analytical skills in business processes, financial modeling, forecasting, and data flow understanding.
Critical thinking and problem-solving capabilities.
Experience producing high-quality technical documentation and presentations.
Excellent communication and presentation skills, with the ability to explain insights to leadership and business teams.
Benefits
Medical coverage and Health Savings Account (HSA) through Anthem
Dental/Vision/Various Ancillary coverages through Unum
401(k) retirement savings plan
Paid-time-off options
Company-paid Employee Assistance Program (EAP)
Discount programs through ADP WorkforceNow
Additional Details
The base range for this contract position is $73 - $83 per hour, depending on experience. Our pay ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hires of this position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Qualified applicants with arrest or conviction records will be considered.
About Us
STAND 8 provides end-to-end IT solutions to enterprise partners across the United States and globally with offices in Los Angeles, Atlanta, New York, Mexico, Japan, India, and more. STAND 8 focuses on the "bleeding edge" of technology and leverages automation, process, marketing, and over fifteen years of success and growth to provide a world-class experience for our customers, partners, and employees.
Our mission is to impact the world positively by creating success through PEOPLE, PROCESS, and TECHNOLOGY.
Check out more at ************** and reach out today to explore opportunities to grow together!
By applying to this position, your data will be processed in accordance with the STAND 8 Privacy Policy.
Data Scientist
Data engineer job in San Francisco, CA
We're working with a Series A health tech start-up pioneering a revolutionary approach to healthcare AI, developing neurosymbolic systems that combine statistical learning with structured medical knowledge. Their technology is being adopted by leading health systems and insurers to enhance patient outcomes through advanced predictive analytics.
We're seeking Machine Learning Engineers who excel at the intersection of data science, modeling, and software engineering. You'll design and implement models that extract insights from longitudinal healthcare data, balancing analytical rigor, interpretability, and scalability.
This role offers a unique opportunity to tackle foundational modeling challenges in healthcare, where your contributions will directly influence clinical, actuarial, and policy decisions.
Key Responsibilities
Develop predictive models to forecast disease progression, healthcare utilization, and costs using temporal clinical data (claims, EHR, laboratory results, pharmacy records)
Design interpretable and explainable ML solutions that earn the trust of clinicians, actuaries, and healthcare decision-makers
Research and prototype innovative approaches leveraging both classical and modern machine learning techniques
Build robust, scalable ML pipelines for training, validation, and deployment in distributed computing environments
Collaborate cross-functionally with data engineers, clinicians, and product teams to ensure models address real-world healthcare needs
Communicate findings and methodologies effectively through visualizations, documentation, and technical presentations
Required Qualifications
Strong foundation in statistical modeling, machine learning, or data science, with preference for experience in temporal or longitudinal data analysis
Proficiency in Python and ML frameworks (PyTorch, JAX, NumPyro, PyMC, etc.)
Proven track record of transitioning models from research prototypes to production systems
Experience with probabilistic methods, survival analysis, or Bayesian inference (highly valued)
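Because survival analysis is called out just above, here is a minimal Kaplan-Meier sketch using the lifelines library on synthetic data; production models on longitudinal claims/EHR data would be far more involved (covariates, competing risks, calibration).

```python
# Toy Kaplan-Meier fit: estimate a survival curve with right-censoring.
from lifelines import KaplanMeierFitter

durations = [5, 6, 6, 2.5, 4, 4]      # months of follow-up (synthetic)
event_observed = [1, 0, 0, 1, 1, 1]   # 1 = event occurred, 0 = censored

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=event_observed, label="toy cohort")
print(kmf.survival_function_)         # estimated S(t) at observed times
print(kmf.median_survival_time_)      # median time-to-event estimate
```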
Bonus Qualifications
Experience working with clinical data and healthcare terminologies (ICD, CPT, SNOMED CT, LOINC)
Background in actuarial modeling, claims forecasting, or risk adjustment methodologies
Data Scientist V
Data engineer job in Menlo Park, CA
Creospan is a growing tech collective of makers, shakers, and problem solvers, offering solutions today that will propel businesses into a better tomorrow. “Tomorrow's ideas, built today!” In addition to being able to work alongside equally brilliant and motivated developers, our consultants appreciate the opportunity to learn and apply new skills and methodologies to different clients and industries.
NO C2C/3RD PARTY; W2 CANDIDATES ONLY. Must be able to work in the US without sponsorship now or in the future.
Summary:
The main function of the Data Scientist is to produce innovative solutions driven by exploratory data analysis from complex and high-dimensional datasets.
Job Responsibilities:
• Apply knowledge of statistics, machine learning, programming, data modeling, simulation, and advanced mathematics to recognize patterns, identify opportunities, pose business questions, and make valuable discoveries leading to prototype development and product improvement.
• Use a flexible, analytical approach to design, develop, and evaluate predictive models and advanced algorithms that lead to optimal value extraction from the data.
• Generate and test hypotheses and analyze and interpret the results of product experiments.
• Work with product engineers to translate prototypes into new products, services, and features and provide guidelines for large-scale implementation.
• Provide Business Intelligence (BI) and data visualization support, which includes, but is not limited to, support for the online customer service dashboards and other ad-hoc requests requiring data analysis and visual support.
Skills:
• Experienced in either programming languages such as Python and/or R, big data tools such as Hadoop, or data visualization tools such as Tableau.
• The ability to communicate effectively in writing, including conveying complex information and promoting in-depth engagement on course topics.
• Experience working with large datasets.
Education/Experience:
• Master of Science degree in computer science or in a relevant field.
Lead Data Scientist - Computer Vision
Data engineer job in Santa Clara, CA
Lead Data Scientist - Computer Vision/Image Processing
About the Role
We are seeking a Lead Data Scientist to drive the strategy and execution of data science initiatives, with a particular focus on computer vision systems and image processing techniques. The ideal candidate has deep expertise in image processing techniques including Filtering, Binary Morphology, Perspective/Affine Transformation, and Edge Detection.
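For orientation, here is a compact OpenCV sketch that exercises each of the techniques just named on a synthetic image; it is illustrative only and assumes the opencv-python and numpy packages.

```python
# Filtering, binary morphology, perspective transform, and edge detection.
import cv2
import numpy as np

img = np.zeros((200, 200), dtype=np.uint8)
cv2.rectangle(img, (50, 50), (150, 150), 255, -1)          # synthetic test image

blurred = cv2.GaussianBlur(img, (5, 5), 0)                 # filtering
kernel = np.ones((3, 3), np.uint8)
opened = cv2.morphologyEx(blurred, cv2.MORPH_OPEN, kernel) # binary morphology
edges = cv2.Canny(opened, 100, 200)                        # edge detection

# Perspective transform: map the square's corners to a skewed quadrilateral.
src = np.float32([[50, 50], [150, 50], [150, 150], [50, 150]])
dst = np.float32([[60, 40], [140, 60], [150, 150], [50, 140]])
M = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(img, M, (200, 200))
```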
Responsibilities
Solid knowledge of computer vision programs and image processing techniques: Filtering, Binary Morphology, Perspective/Affine Transformation, Edge Detection
Strong understanding of machine learning: Regression, Supervised and Unsupervised Learning
Proficiency in Python and libraries such as OpenCV, NumPy, scikit-learn, TensorFlow/PyTorch.
Familiarity with version control (Git) and collaborative development practices
AI Data Scientist
Data engineer job in Cupertino, CA
Onsite in Cupertino, CA from Day 1 (client prefers local candidates)
Hybrid Schedule: 3 Onsite Days (Tue, Wed, Thur) & 2 Remote Days (Mon, Fri)
Long term contract
Direct client opportunity
No mid layer / no implementation partners involved
Key points
- Need someone focused on product management and integration of generative AI solutions
- Excellent communication, organizational, and problem-solving skills
We are seeking an AI Engineer to join our Legal Operations team and lead the design, development and deployment of AI-powered tools and automation solutions that transform how our Legal Department operates.
This is a unique opportunity for a technically skilled and product-minded professional who can bridge the gap between engineering, legal, and business functions.
You will work closely with attorneys, legal ops specialists, and other engineering teams to identify opportunities for AI-driven efficiency, develop prototypes and bring scalable solutions to life.
The ideal candidate combines strong software engineering and AI expertise with excellent communication skills, product sensibility, and a curiosity about legal workflows and technology.
Description
As a Senior Data Scientist/AI Engineer, you will be responsible for overseeing the design and execution of key tool development programs.
Key responsibilities may include:
Develop and deploy AI solutions that enhance legal workflows, including contract review, document classification, knowledge management and workflow automation.
Collaborate cross-functionally with attorneys, legal operations, compliance and engineering teams to identify and prioritize AI use cases.
Act as a product developer and owner from concept to rollout: defining requirements, developing proofs of concept, collecting feedback, and iterating on solutions.
Integrate large language models (LLMs) and other AI technologies into existing systems (e.g., document management, eDiscovery, CLM, or knowledge bases).
Evaluate and integrate third-party legal AI tools and platforms as needed, ensuring compatibility and compliance with internal systems.
Maintain strong documentation and governance around data usage, model performance and ethical AI standards.
Stay current on emerging trends in AI, machine learning and legal tech to help shape the department's AI strategy.
Minimum Qualifications
Bachelor's degree in Computer Science, Data Science, Engineering, or related field (or equivalent experience).
5+ years of experience building and deploying AI/ML or automation solutions in production environments.
Strong programming skills in Python (proven ability to quickly master new frameworks and tools).
Demonstrated experience with modern AI architectures, including context engineering, tool use, and retrieval-augmented generation (a toy retrieval sketch follows this list).
Proven ability to communicate complex technical concepts to non-technical stakeholders.
Strong product development mindset: able to translate business needs into practical, scalable AI tools.
Prior experience in or exposure to legal tech or legal operations.
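To give a feel for the retrieval-augmented generation experience asked for above, here is a toy RAG skeleton in Python: TF-IDF retrieval over a few placeholder legal snippets plus a stubbed model call. The documents and the call_llm function are hypothetical; a real system would use a production embedding model and an actual LLM endpoint.

```python
# Toy RAG skeleton: retrieve relevant snippets, then prompt a (stubbed) model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Master services agreement: termination requires 60 days written notice.",
    "NDA template: confidentiality survives three years after expiration.",
    "SOW checklist: deliverables, milestones, and acceptance criteria.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    vectorizer = TfidfVectorizer().fit(documents + [query])
    doc_vecs = vectorizer.transform(documents)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_vecs)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real model endpoint here.
    return f"[model response to {len(prompt)} chars of prompt]"

query = "What notice period applies to termination?"
context = "\n".join(retrieve(query))
answer = call_llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
print(answer)
```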
Preferred Qualifications
Familiarity with DMS, document intelligence and CLM systems (e.g., Ironclad, Icertis, DocuSign CLM), document management platforms (e.g., iManage, NetDocuments) or legal AI tools (e.g., Harvey, Luminance, Casetext, Spellbook, etc.).
Experience building internal AI assistants or chatbots for enterprise knowledge retrieval.
Understanding of data privacy, compliance and governance frameworks relevant to legal data.
Pay Range: $65/hr - $70/hr
The specific compensation for this position will be determined by a number of factors, including the scope, complexity and location of the role as well as the cost of labor in the market; the skills, education, training, credentials and experience of the candidate; and other conditions of employment. Our full-time consultants have access to benefits including medical, dental, vision as well as 401K contributions.
Principal Data Scientist
Data engineer job in Alhambra, CA
Job Title - Principal Data Scientist - 809
Duration - 12+ months
Only on W2 & California resident candidates only
JOB DETAILS:
Skills Required:
The Principal Data Scientist works to establish a comprehensive Data Science Program to advance data-driven decision-making, streamline operations, and fully leverage modern platforms, including Databricks or similar, to meet increasing demand for predictive analytics and AI solutions. The Principal Data Scientist will guide program development, provide training and mentorship to junior members of the team, accelerate adoption of advanced analytics, and build internal capacity through structured mentorship.
The Principal Data Scientist will possess:
Exceptional communication abilities, both verbal and written, with a strong customer service mindset and the ability to translate complex concepts into clear, actionable insights
Strong analytical and business acumen, including foundational experience with regression, association analysis, outlier detection, and core data analysis principles
Working knowledge of database design and organization, with the ability to partner effectively with Data Management and Data Engineering teams
Outstanding time management and organizational skills, with demonstrated success managing multiple priorities and deliverables in parallel
A highly collaborative work style, coupled with the ability to operate independently, maintain focus, and drive projects forward with minimal oversight
A meticulous approach to quality, ensuring accuracy, reliability, and consistency in all deliverables
Proven mentorship capabilities, including the ability to guide, coach, and upskill junior data scientists and analysts
Experience Preferred
Five (5)+ years of professional experience leading data science initiatives, including developing machine learning models, statistical analyses, and end-to-end data science workflows in production environments.
Three (3)+ years of experience working with Databricks and similar cloud-based analytics platforms, including notebook development, feature engineering, ML model training, and workflow orchestration.
Three (3)+ years of experience applying advanced analytics and predictive modeling (e.g., regression, classification, clustering, forecasting, natural language processing).
Two (2)+ years of experience implementing MLOps practices, such as model versioning, CI/CD for ML, MLflow, automated pipelines, and model performance monitoring (a minimal tracking sketch follows this list).
Two (2)+ years of experience collaborating with data engineering teams to design data pipelines, optimize data transformations, and implement Lakehouse or data warehouse architectures (e.g., Databricks, Snowflake, SQL-based platforms).
Two (2)+ years of experience mentoring or supervising junior data scientists or analysts, including code reviews, training, and structured skill development.
Two (2)+ years of experience with Python and SQL programming, using data sources such as SQL Server, Oracle, PostgreSQL, or similar relational databases.
One (1)+ year of experience operationalizing analytics within enterprise governance frameworks, partnering with Data Management, Security, and IT to ensure compliance, reproducibility, and best practices.
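As a concrete anchor for the MLOps item above, here is a minimal MLflow tracking sketch; the experiment name, parameters, and metric values are placeholders.

```python
# Minimal MLflow experiment tracking: log params and metrics for one run.
import mlflow

mlflow.set_experiment("demand-forecast-dev")   # hypothetical experiment name
with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("model_type", "gradient_boosting")
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("auc", 0.87)             # placeholder evaluation result
    mlflow.log_metric("rmse", 12.4)            # placeholder evaluation result
# Runs become comparable in the MLflow UI, and logged models can be registered
# and versioned for promotion from staging to production.
```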
Education Preferred
This classification requires possession of a Master's degree or higher in Data Science, Statistics, Computer Science, or a closely related field. Additional qualifying professional experience may be substituted for the required education on a year-for-year basis. At least one of the following industry-recognized certifications in data science or cloud analytics is also required:
• Microsoft Azure Data Scientist Associate (DP-100)
• Databricks Certified Data Scientist or Machine Learning Professional
• AWS Machine Learning Specialty
• Google Professional Data Engineer
• or equivalent advanced analytics certifications
The certification is required and may not be substituted with additional experience.
Additional Information
California Resident Candidates Only. This position is HYBRID (2 days onsite, 2 days telework). Interviews will be conducted via Microsoft Teams. The work schedule follows a 4/40 (10-hour days, Monday-Thursday), with the specific shift determined by the program manager. Shifts may range between 7:15 a.m. and 6:00 p.m.
Senior Data Engineer - Spark, Airflow
Data engineer job in Fremont, CA
We are seeking an experienced Data Engineer to design and optimize scalable data pipelines that drive our global data and analytics initiatives.
In this role, you will leverage technologies such as Apache Spark, Airflow, and Python to build high-performance data processing systems and ensure data quality, reliability, and lineage across Mastercard's data ecosystem.
The ideal candidate combines strong technical expertise with hands-on experience in distributed data systems, workflow automation, and performance tuning to deliver impactful, data-driven solutions at enterprise scale.
Responsibilities:
Design and optimize Spark-based ETL pipelines for large-scale data processing.
Build and manage Airflow DAGs for scheduling, orchestration, and checkpointing.
Implement partitioning and shuffling strategies to improve Spark performance.
Ensure data lineage, quality, and traceability across systems.
Develop Python scripts for data transformation, aggregation, and validation.
Execute and tune Spark jobs using spark-submit (see the sketch after this list).
Perform DataFrame joins and aggregations for analytical insights.
Automate multi-step processes through shell scripting and variable management.
Collaborate with data, DevOps, and analytics teams to deliver scalable data solutions.
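To ground the Airflow and spark-submit items above, here is a minimal sketch of a daily DAG that shells out to spark-submit. Airflow 2.x syntax is assumed, and the DAG id, paths, and cluster settings are hypothetical.

```python
# Minimal Airflow DAG: one spark-submit task per day, templated on the run date.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_etl",          # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="spark_submit_etl",
        bash_command=(
            "spark-submit --master yarn --deploy-mode cluster "
            "--num-executors 10 --executor-memory 4g "
            "/opt/jobs/sales_etl.py --run-date {{ ds }}"   # hypothetical script
        ),
    )
```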
Qualifications:
Bachelor's degree in Computer Science, Data Engineering, or related field (or equivalent experience).
At least 7 years of experience in data engineering or big data development.
Strong expertise in Apache Spark architecture, optimization, and job configuration.
Proven experience authoring, scheduling, checkpointing, and monitoring Airflow DAGs.
Skilled in data shuffling, partitioning strategies, and performance tuning in distributed systems.
Expertise in Python programming including data structures and algorithmic problem-solving.
Hands-on with Spark DataFrames and PySpark transformations using joins, aggregations, filters.
Proficient in shell scripting, including managing and passing variables between scripts.
Experienced with spark-submit for deployment and tuning.
Solid understanding of ETL design, workflow automation, and distributed data systems.
Excellent debugging and problem-solving skills in large-scale environments.
Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
Knowledge of data lineage and data quality frameworks like Apache Atlas.
Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
Sr Data Platform Engineer
Data engineer job in Elk Grove, CA
Hybrid role 3X a week in office in Elk Grove, CA; no remote capabilities
This is a direct hire opportunity.
We're seeking a seasoned Senior Data Platform Engineer to design, build, and optimize scalable data solutions that power analytics, reporting, and AI/ML initiatives. This full‑time role is hands‑on, working with architects, analysts, and business stakeholders to ensure data systems are reliable, secure, and high‑performing.
Responsibilities:
Build and maintain robust data pipelines (structured, semi‑structured, unstructured).
Implement ETL workflows with Spark, Delta Lake, and cloud‑native tools.
Support big data platforms (Databricks, Snowflake, GCP) in production.
Troubleshoot and optimize SQL queries, Spark jobs, and workloads.
Ensure governance, security, and compliance across data systems.
Integrate workflows into CI/CD pipelines with Git, Jenkins, Terraform.
Collaborate cross‑functionally to translate business needs into technical solutions.
Qualifications:
7+ years in data engineering with production pipeline experience.
Expertise in Spark ecosystem, Databricks, Snowflake, GCP.
Strong skills in PySpark, Python, SQL.
Experience with RAG systems, semantic search, and LLM integration.
Familiarity with Kafka, Pub/Sub, vector databases.
Proven ability to optimize ETL jobs and troubleshoot production issues.
Agile team experience and excellent communication skills.
Certifications in Databricks, Snowflake, GCP, or Azure.
Exposure to Airflow, BI tools (Power BI, Looker Studio).
Imaging Data Engineer/Architect
Data engineer job in San Francisco, CA
About us:
Intuitive is an innovation-led engineering company delivering business outcomes for hundreds of enterprises globally. With the reputation of being a Tiger Team and a Trusted Partner of enterprise technology leaders, we help solve the most complex Digital Transformation challenges across the following Intuitive Superpowers:
Modernization & Migration
Application & Database Modernization
Platform Engineering (IaC/EaC, DevSecOps & SRE)
Cloud Native Engineering, Migration to Cloud, VMware Exit
FinOps
Data & AI/ML
Data (Cloud Native / DataBricks / Snowflake)
Machine Learning, AI/GenAI
Cybersecurity
Infrastructure Security
Application Security
Data Security
AI/Model Security
SDx & Digital Workspace (M365, G-suite)
SDDC, SD-WAN, SDN, NetSec, Wireless/Mobility
Email, Collaboration, Directory Services, Shared Files Services
Intuitive Services:
Professional and Advisory Services
Elastic Engineering Services
Managed Services
Talent Acquisition & Platform Resell Services
About the job:
Title: Imaging Data Engineer/Architect
Start Date: Immediate
# of Positions: 1
Position Type: Contract/ Full-Time
Location: San Francisco, CA
Notes:
Imaging Data Engineer/Architect who understands radiology and digital pathology, and the related clinical data and metadata.
Hands-on experience with the above technologies, and good knowledge of biomedical imaging and data pipelines overall.
About the Role
We are seeking a highly skilled Imaging Data Engineer/Architect to join our San Francisco team as a Subject Matter Expert (SME) in radiology and digital pathology. This role will design and manage imaging data pipelines, ensuring seamless integration of clinical data and metadata to support advanced diagnostic and research applications. The ideal candidate will have deep expertise in medical imaging standards, cloud-based data architectures, and healthcare interoperability, contributing to innovative solutions that enhance patient outcomes.
Responsibilities
Design and implement scalable data architectures for radiology and digital pathology imaging data, including DICOM, HL7, and FHIR standards.
Develop and optimize data pipelines to process and store large-scale imaging datasets (e.g., MRI, CT, histopathology slides) and associated metadata (a metadata-extraction sketch follows this list).
Collaborate with clinical teams to understand radiology and pathology workflows, ensuring data solutions align with clinical needs.
Ensure data integrity, security, and compliance with healthcare regulations (e.g., HIPAA, GDPR).
Integrate imaging data with AI/ML models for diagnostic and predictive analytics, working closely with data scientists.
Build and maintain metadata schemas to support data discoverability and interoperability across systems.
Provide technical expertise to cross-functional teams, including product managers and software engineers, to drive imaging data strategy.
Conduct performance tuning and optimization of imaging data storage and retrieval systems in cloud environments (e.g., AWS, Google Cloud, Azure).
Document data architectures and processes, ensuring knowledge transfer to internal teams and external partners.
Stay updated on emerging imaging technologies and standards, proposing innovative solutions to enhance data workflows.
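As a small illustration of the DICOM metadata work referenced in the pipeline item above, here is a sketch using the pydicom library; the file path is hypothetical, and a real pipeline would validate tags and handle missing attributes.

```python
# Extract common DICOM metadata fields into a record for a pipeline.
import pydicom

ds = pydicom.dcmread("/data/imaging/study_001/slice_0001.dcm")  # hypothetical path

record = {
    "patient_id": ds.PatientID,
    "modality": ds.Modality,            # e.g., "MR" or "CT"
    "study_uid": ds.StudyInstanceUID,
    "series_uid": ds.SeriesInstanceUID,
    "rows": int(ds.Rows),
    "cols": int(ds.Columns),
}
pixels = ds.pixel_array                 # image data as a NumPy array
print(record, pixels.shape)
```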
Qualifications
Education: Bachelor's degree in Computer Science, Biomedical Engineering, or a related field (Master's preferred).
Experience:
5+ years in data engineering or architecture, with at least 3 years focused on medical imaging (radiology and/or digital pathology).
Proven experience with DICOM, HL7, FHIR, and imaging metadata standards (e.g., SNOMED, LOINC).
Hands-on experience with cloud platforms (AWS, Google Cloud, or Azure) for imaging data storage and processing.
Technical Skills:
Proficiency in programming languages (e.g., Python, Java, SQL) for data pipeline development.
Expertise in ETL processes, data warehousing, and database management (e.g., Snowflake, BigQuery, PostgreSQL).
Familiarity with AI/ML integration for imaging data analytics.
Knowledge of containerization (e.g., Docker, Kubernetes) for deploying data solutions.
Domain Knowledge:
Deep understanding of radiology and digital pathology workflows, including PACS and LIS systems.
Familiarity with clinical data integration and healthcare interoperability standards.
Soft Skills:
Strong analytical and problem-solving skills to address complex data challenges.
Excellent communication skills to collaborate with clinical and technical stakeholders.
Ability to work independently in a fast-paced environment, with a proactive approach to innovation.
Certifications (preferred):
AWS Certified Solutions Architect, Google Cloud Professional Data Engineer, or equivalent.
Certifications in medical imaging (e.g., CIIP - Certified Imaging Informatics Professional).
Data Engineer (AWS Redshift, BI, Python, ETL)
Data engineer job in Manhattan Beach, CA
We are seeking a skilled Data Engineer with strong experience in business intelligence (BI) and data warehouse development to join our team. In this role, you will design, build, and optimize data pipelines and warehouse architectures that support analytics, reporting, and data-driven decision-making. You will work closely with analysts, data scientists, and business stakeholders to ensure reliable, scalable, and high-quality data solutions.
Responsibilities:
Develop and maintain ETL/ELT pipelines for ingesting, transforming, and delivering data.
Design and enhance data warehouse models (star/snowflake schemas) and BI datasets.
Optimize data workflows for performance, scalability, and reliability.
Collaborate with BI teams to support dashboards, reporting, and analytics needs.
Ensure data quality, governance, and documentation across all solutions.
Qualifications:
Proven experience with data engineering tools (SQL, Python, ETL frameworks).
Strong understanding of BI concepts, reporting tools, and dimensional modeling.
Hands-on experience with cloud data platforms (e.g., AWS, Azure, GCP) is a plus.
Excellent problem-solving skills and ability to work in a cross-functional environment.
Bigdata Engineer
Data engineer job in Mountain View, CA
Net2Source is a Global Workforce Solutions Company headquartered in NJ, USA, with branch offices in the Asia Pacific region. We are one of the fastest-growing IT consulting companies across the USA, and we are hiring a "Bigdata Engineer" for one of our clients. We offer a wide gamut of consulting solutions customized to our 450+ clients ranging from Fortune 500/1000 to start-ups across various verticals like Technology, Financial Services, Healthcare, Life Sciences, Oil & Gas, Energy, Retail, Telecom, Utilities, Manufacturing, the Internet, and Engineering.
Position: Bigdata Engineer
Location: Mountain View, CA (Onsite) - Locals Only
Type: Contract
Exp Level - 10+ Years
Required Skills
Minimum of 7+ years working with Apache Flink and Apache Spark
5+ years' experience with Java
Strong expertise in Python
Expertise developing new pipelines
Adept at supporting and enhancing existing pipelines
Strong experience with AWS Stack
Why Work With Us?
We believe in more than just jobs; we build careers. At Net2Source, we champion leadership at all levels, celebrate diverse perspectives, and empower you to make an impact. Think work-life balance, professional growth, and a collaborative culture where your ideas matter.
Our Commitment to Inclusion & Equity
Net2Source is an equal opportunity employer, dedicated to fostering a workplace where diverse talents and perspectives are valued. We make all employment decisions based on merit, ensuring a culture of respect, fairness, and opportunity for all, regardless of age, gender, ethnicity, disability, or other protected characteristics.
Awards & Recognition
America's Most Honored Businesses (Top 10%)
Fastest-Growing Staffing Firm by Staffing Industry Analysts
INC 5000 List for Eight Consecutive Years
Top 100 by Dallas Business Journal
Spirit of Alliance Award by Agile1
Maddhuker Singh
Sr Account & Delivery Manager
***********************
Senior Data Engineer
Data engineer job in Newport Beach, CA
Pacific Life is a Fortune 500 company headquartered in Newport Beach, California, with nearly 160 years of experience supporting policyholders. The company provides innovative life insurance, annuity solutions, and mutual funds to help individuals and businesses achieve financial security. Serving both retail and institutional markets, Pacific Life emphasizes offering value and stability to current and future generations. With a strong commitment to excellence, Pacific Life has earned a reputation for financial strength and reliability.
Role Description
This is a full-time, hybrid Senior Data Engineer position based in Newport Beach, California. The Senior Data Engineer will be responsible for designing, building, and maintaining scalable data pipelines and architecture. Day-to-day responsibilities include implementing Extract Transform Load (ETL) processes, managing data warehouses, and supporting data analytics initiatives to drive business insights. Collaboration with cross-functional teams and ensuring data quality and efficiency are essential aspects of this role.
Qualifications
Proficiency in Data Engineering, including building and maintaining large-scale data pipelines
Experience in Data Modeling, database design, and optimizing data structures
Hands-on expertise in Extract Transform Load (ETL) processes and tools (see the minimal ETL sketch after this list)
Knowledge of Data Warehousing technologies and managing data storage solutions
Familiarity with Data Analytics and deriving meaningful insights from datasets
Strong problem-solving skills and the ability to work collaboratively in a team environment
Bachelor's degree in Computer Science, Data Science, or a related field
7+ years of experience in the analysis, design, development, and delivery of data solutions
7+ years of experience in SQL, ETL, ELT, leading cloud data warehouse technologies, data transformations and data management tools
Experience working in Azure DevOps (ADO), building and releasing CI/CD pipelines, and orchestration
Experience with AWS
2+ years of experience with Snowflake, dbt, Matillion or Profisee
Understanding of data catalogs, glossaries, data quality, and effective data governance
Master Data Management (MDM)/Reference Data experience
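As a hedged illustration of the ETL expertise listed above, here is a minimal extract-transform-load sketch in Python using pandas and sqlite3. The source file, column names, and target table are hypothetical placeholders, not details from this posting.

import sqlite3
import pandas as pd

# Extract: read a raw policy extract (the path is a placeholder).
raw = pd.read_csv("policies_raw.csv")

# Transform: normalize column names and derive a clean date column.
raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
raw["effective_date"] = pd.to_datetime(raw["effective_date"], errors="coerce")
clean = raw.dropna(subset=["policy_id", "effective_date"])

# Load: write the curated table into a warehouse-like store.
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("dim_policy", conn, if_exists="replace", index=False)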
Snowflake Data Engineer (DBT SQL)
Data engineer job in San Jose, CA
Duration: 6 months
Key Responsibilities
• Design, develop, and optimize data pipelines using Snowflake and DBT SQL.
• Implement and manage data warehousing concepts, metadata management, and data modeling.
• Work with data lakes, multi-dimensional models, and data dictionaries.
• Utilize Snowflake features such as Time Travel and Zero-Copy Cloning (see the sketch after this list).
• Perform query performance tuning and cost optimization in cloud environments.
• Administer Snowflake architecture, warehousing, and processing.
• Develop and maintain PL/SQL Snowflake solutions.
• Apply design patterns for scalable and maintainable data solutions.
• Collaborate with cross-functional teams and tech leads across multiple tracks.
• Provide technical and functional guidance to team members.
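The Time Travel and Zero-Copy Cloning bullet above can be made concrete with a short sketch using the snowflake-connector-python package. The connection parameters and the orders table below are placeholders, not details from this engagement.

import snowflake.connector

# All connection values are placeholders.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="compute_wh",
    database="analytics",
    schema="public",
)
cur = conn.cursor()

# Time Travel: query the table as it existed one hour ago.
cur.execute("SELECT COUNT(*) FROM orders AT(OFFSET => -3600)")
print(cur.fetchone())

# Zero-Copy Cloning: snapshot the table without duplicating storage.
cur.execute("CREATE TABLE orders_backup CLONE orders")

cur.close()
conn.close()

A clone shares the source table's storage until either copy changes, which is why it is effectively free to create; Time Travel works within the table's configured retention period.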
Required Skills & Experience
• Hands-on Snowflake development experience (mandatory).
• Strong proficiency in SQL and DBT SQL.
• Knowledge of data warehousing concepts, metadata management, and data modeling.
• Experience with data lakes, multi-dimensional models, and data dictionaries.
• Expertise in Snowflake features (Time Travel, Zero-Copy Cloning).
• Strong background in query optimization and cost management.
• Familiarity with Snowflake administration and pipeline development.
• Knowledge of PL/SQL and SQL databases (additional plus).
• Excellent communication, leadership, and organizational skills.
• Strong team player with a positive attitude.
Data Engineer
Data engineer job in San Jose, CA
Midjourney is a research lab exploring new mediums to expand the imaginative powers of the human species. We are a small, self-funded team focused on design, human infrastructure, and AI. We have no investors, no big company controlling us, and no advertisers. We are 100% supported by our amazing community.
Our tools are already used by millions of people to dream, to explore, and to create. But this is just the start. We think the story of the 2020s is about building the tools that will remake the world for the next century. We're making those tools, to expand what it means to be human.
Core Responsibilities:
Design and maintain data pipelines to consolidate information across multiple sources (subscription platforms, payment systems, infrastructure and usage monitoring, and financial systems) into a unified analytics environment
Build and manage interactive dashboards and self-service BI tools that enable leadership to track key business metrics including revenue performance, infrastructure costs, customer retention, and operational efficiency
Serve as technical owner of our financial planning platform (Pigment or similar), leading implementation and build-out of models, data connections, and workflows in partnership with Finance leadership to translate business requirements into functional system architecture
Develop automated data quality checks and cleaning processes to ensure accuracy and consistency across financial and operational datasets (see the sketch after this list)
Partner with Finance, Product and Operations teams to translate business questions into analytical frameworks, including cohort analysis, cost modeling, and performance trending
Create and maintain documentation for data models, ETL processes, dashboard logic, and system workflows to ensure knowledge continuity
Support strategic planning initiatives by building financial models, scenario analyses, and data-driven recommendations for resource allocation and growth investments
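As a hedged illustration of the automated data quality checks mentioned above, here is a minimal pandas sketch; the file path, column names, and rules are hypothetical, not Midjourney's actual checks.

import pandas as pd

def check_revenue_extract(path: str) -> list[str]:
    """Return a list of data quality failures for a revenue extract."""
    df = pd.read_csv(path)
    failures = []

    # Completeness: required columns must exist and contain no nulls.
    for col in ("customer_id", "invoice_date", "amount"):
        if col not in df.columns:
            failures.append(f"missing column: {col}")
        elif df[col].isna().any():
            failures.append(f"nulls found in: {col}")

    # Validity: amounts should be non-negative.
    if "amount" in df.columns and (df["amount"] < 0).any():
        failures.append("negative amounts found")

    # Uniqueness: expect one row per (customer_id, invoice_date) pair.
    if {"customer_id", "invoice_date"}.issubset(df.columns):
        if df.duplicated(subset=["customer_id", "invoice_date"]).any():
            failures.append("duplicate customer/date rows")

    return failures

In practice a check like this runs as a pipeline step that fails the job or raises an alert whenever the returned list is non-empty.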
Required Qualifications:
3-5+ years of experience in data engineering, analytics engineering, or a similar role, with demonstrated ability to work with large-scale datasets
Strong SQL skills and experience with modern data warehousing solutions (BigQuery, Snowflake, Redshift, etc.)
Proficiency in at least one programming language (Python, R) for data manipulation and analysis
Experience with BI/visualization tools (Looker, Tableau, Power BI, or similar)
Hands-on experience administering enterprise financial systems (NetSuite, SAP, Oracle, or similar ERP platforms)
Experience working with Stripe Billing or similar subscription management platforms, including data extraction and revenue reporting
Ability to communicate technical concepts clearly to non-technical stakeholders
Data Engineer
Data engineer job in Irvine, CA
Thank you for stopping by to take a look at the Data Engineer role I posted here on LinkedIn; I appreciate it.
If you have read my job descriptions in the past, you will recognize how I write them. If you are new, allow me to introduce myself. My name is Tom Welke. I am Partner & VP at RSM Solutions, Inc, and I have been recruiting technical talent for more than 23 years, having been in the tech space since the 1990s. Because of this, I actually write JDs myself...no AI, no 'bots', just a real live human. I realized a while back that looking for work is about as fun as a root canal with no anesthesia...especially now. So, rather than saying 'must work well with others' and 'team mindset', I do away with that kind of nonsense and just tell it like it is.
So, as with every role I work on, social fit is almost as important as technical fit. For this one, technical fit is very important, but we also have some social fit characteristics that matter. This is the kind of place that requires people to dive in and learn. The hiring manager for this one is actually a very dear friend of mine, and he said something interesting to me not long ago: if you aren't spending at least an hour a day learning something new, you are really doing yourself a disservice. This is that classic environment where no one says 'this is not my job', so the ability to jump in and help is needed for success in this role.
This role is being done onsite in Irvine, California. I prefer working with candidates that are already local to the area. If you need to relocate, that is fine, but there are no relocation dollars available.
I can only work with US Citizens or Green Card Holders for this role. I cannot work with H1, OPT, EAD, F1, H4, or anyone that is not already a US Citizen or Green Card Holder for this role.
The Data Engineer role is similar to the Data Integration role I posted. However, this one is more Ops-focused: it covers orchestrating deployments and MLflow, orchestrating and using data on the clusters, and managing how the models are performing. This role focuses on coding and configuring on the ML side of the house.
You will be designing, automating, and observing end-to-end data pipelines that feed this client's Kubeflow-driven machine learning platform, ensuring models are trained, deployed, and monitored on trustworthy, well-governed data. You will build batch/stream workflows, wire them into Azure DevOps CI/CD, and surface real-time health metrics in Prometheus + Grafana dashboards to guarantee data availability. The role bridges Data Engineering and MLOps, so data scientists can focus on experimentation while the business sees rapid, reliable predictive insight.
Here are some of the main responsibilities:
Design and implement batch and streaming pipelines in Apache Spark running on Kubernetes and Kubeflow Pipelines to hydrate feature stores and training datasets.
Build high-throughput ETL/ELT jobs with SSIS, SSAS, and T-SQL against MS SQL Server, applying Data Vault-style modeling patterns for auditability.
Integrate source control, build, and release automation using GitHub Actions and Azure DevOps for every pipeline component.
Instrument pipelines with Prometheus exporters and visualize SLA, latency, and error budget metrics to enable proactive alerting.
Create automated data quality and schema drift checks; surface anomalies to support a rapid incident response process.
Use MLflow Tracking and Model Registry to version artifacts, parameters, and metrics for reproducible experiments and safe rollbacks (see the sketch after this list).
Work with data scientists to automate model retraining and deployment triggers within Kubeflow based on data freshness or concept drift signals.
Develop PowerShell and .NET utilities to orchestrate job dependencies, manage secrets, and publish telemetry to Azure Monitor.
Optimize Spark and SQL workloads through indexing, partitioning, and cluster sizing strategies, benchmarking performance in CI pipelines.
Document lineage, ownership, and retention policies; ensure pipelines conform to PCI/SOX and internal data governance standards.
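As a hedged illustration of the MLflow bullet above, here is a minimal experiment-tracking sketch in Python. The tracking URI, experiment name, model, and registered-model name are all hypothetical, not details of this client's platform.

import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

mlflow.set_tracking_uri("http://localhost:5000")  # placeholder server
mlflow.set_experiment("churn-retraining")         # hypothetical name

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Registering the artifact enables safe rollbacks to prior versions.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-model")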
Here is what we are seeking:
At least 6 years of experience building data pipelines in Spark or equivalent.
At least 2 years deploying workloads on Kubernetes/Kubeflow.
At least 2 years of experience with MLflow or similar experiment-tracking tools.
At least 6 years of experience with T-SQL and Python/Scala for Spark.
At least 6 years of PowerShell/.NET scripting.
At least 6 years of experience with GitHub, Azure DevOps, Prometheus, Grafana, and SSIS/SSAS.
Kubernetes CKA/CKAD, Azure Data Engineer (DP-203), or MLOps-focused certifications (e.g., Kubeflow or MLflow) would be great to see.
A willingness to mentor engineers on best practices in containerized data engineering and MLOps.
Staff Data Scientist
Data engineer job in San Jose, CA
Staff Data Scientist | San Francisco | $250K-$300K + Equity
We're partnering with one of the fastest-growing AI companies in the world to hire a Staff Data Scientist. Backed by over $230M from top-tier investors and already valued at over $1B, they've secured customers that include some of the most recognizable names in tech. Their AI platform powers millions of daily interactions and is quickly becoming the enterprise standard for conversational AI.
In this role, you'll bring rigorous analytics and experimentation leadership that directly shapes product strategy and company performance.
What you'll do:
Drive deep-dive analyses on user behavior, product performance, and growth drivers
Design and interpret A/B tests to measure product impact at scale (see the sketch after this list)
Build scalable data models, pipelines, and dashboards for company-wide use
Partner with Product and Engineering to embed experimentation best practices
Evaluate ML models, ensuring business relevance, performance, and trade-off clarity
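As a hedged illustration of the A/B testing work above, here is a minimal two-proportion z-test sketch using only the Python standard library; the conversion counts are invented for the example.

from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Example: control converts 480/10,000; treatment converts 540/10,000.
z, p = two_proportion_ztest(480, 10_000, 540, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # reject the null at alpha = 0.05 if p < 0.05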
What we're looking for:
5+ years in data science or product analytics at scale (consumer or marketplace preferred)
Advanced SQL and Python skills, with strong foundations in statistics and experimental design
Proven record of designing, running, and analyzing large-scale experiments
Ability to analyze and reason about ML models (classification, recommendation, LLMs)
Strong communicator with a track record of influencing cross-functional teams
If you're excited by the sound of this challenge, apply today and we'll be in touch.