Senior Agentic AI Data Scientist
Bethlehem, PA jobs
We need hands-on engineering leaders, not architects.
Candidates must be very seasoned data science engineers who are willing to take a short online test.
Hybrid role: candidates must be able to work onsite in Hudson Yards, NY or Bethlehem, PA, 2-3 days per week.
I will not entertain out-of-state candidates.
We're looking for a very Senior Data Scientist - Agentic AI with strong hands-on experience in AI/ML, LLMs, and intelligent automation. This role will focus on building, deploying, and scaling Agentic AI systems and enterprise-level generative AI solutions that transform business operations and customer experiences.
You'll work on high-visibility projects alongside senior leadership, translating cutting-edge AI research into real-world impact.
Key Responsibilities:
Design and deploy Agentic AI solutions to automate complex workflows.
Operationalize LLMs and generative AI to process unstructured data (contracts, claims, medical records, etc.).
Build autonomous agents and reasoning systems integrated into enterprise platforms (a minimal sketch follows this list).
Partner with engineering and AIOps teams to move models from prototype to production.
Translate research in reinforcement learning and reasoning into business-ready AI applications.
Mentor junior data scientists and establish best practices for scalable AI development.
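To make the agent-building work concrete, here is a minimal sketch of a two-node agent graph in LangGraph (named in the qualifications below). The state schema, node logic, and sample input are illustrative assumptions rather than the employer's actual design; real nodes would call an LLM and enterprise tools.

```python
# Hypothetical minimal LangGraph agent: a two-node graph that drafts a
# summary of an unstructured document and then reviews it.
from typing import TypedDict

from langgraph.graph import StateGraph, END


class AgentState(TypedDict):
    document: str
    summary: str
    approved: bool


def summarize(state: AgentState) -> dict:
    # Stand-in for an LLM call; here we simply truncate the document.
    return {"summary": state["document"][:200]}


def review(state: AgentState) -> dict:
    # A second "reviewer" step; a real agent would apply an LLM-based check.
    return {"approved": len(state["summary"]) > 0}


graph = StateGraph(AgentState)
graph.add_node("summarize", summarize)
graph.add_node("review", review)
graph.set_entry_point("summarize")
graph.add_edge("summarize", "review")
graph.add_edge("review", END)

app = graph.compile()
result = app.invoke(
    {"document": "Claim #123: water damage ...", "summary": "", "approved": False}
)
print(result["summary"], result["approved"])
```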
What We're Looking For:
PhD with 2+ years of experience, or Master's with 10+ years, in Statistics, Computer Science, Engineering, or Applied Mathematics.
5+ years of hands-on AI/ML development experience.
Strong programming skills in Python, including PyTorch, TensorFlow, and LangGraph.
Proven background in machine learning, optimization, and statistical modeling.
Excellent communication, leadership, and cross-functional collaboration skills.
Senior Data Scientist Agentic AI
New York, NY jobs
My name is Bill Stevens, and I have a new three-month-plus contract-to-hire Senior Data Scientist, Agentic AI opportunity available with a major firm with offices in Midtown Manhattan on the West Side and in Holmdel, New Jersey. Please review my specification below; I am available at any time to speak with you, so feel free to call me. The work week will be hybrid: three days a week in either of the firm's offices and two days remote. The onsite location will be chosen by the candidate.
The ideal candidate should also possess a green card or U.S. citizenship. No visa entanglements and no H-1B holding-company submittals.
The firm's Data & AI team spearheads a culture of intelligence and automation across the enterprise, creating business value from advanced data and AI solutions. The team includes data scientists, engineers, analysts, and product leaders working together to deliver AI-driven products that power growth, improve risk management, and elevate customer experience.
The firm created the Data Science Lab (DSL) in response to emerging technologies, evolving consumer needs, and rapid advances in AI. The DSL expedites the transition to data-driven decision making and fosters innovation by rapidly testing, scaling, and operationalizing state-of-the-art AI.
We are seeking a Senior Data Scientist Engineer, Agentic AI who is an experienced individual contributor with deep expertise in AI/ML and a track record of turning advanced research into practical, impactful enterprise solutions. This role focuses on building, deploying, and scaling agentic AI systems, large language models, and intelligent automation solutions that reshape how the firm operates, serves customers, and drives growth. You'll collaborate directly with senior executives on high-visibility projects to bring next-generation AI to life across the firm's products and services.
Key Responsibilities:
Design and deploy Agentic AI solutions to automate complex business workflows, enhance decision-making, and improve customer and employee experiences.
Operationalize cutting-edge LLMs and generative AI to process and understand unstructured data such as contracts, claims, medical records, and customer interactions (a minimal sketch follows this list).
Build autonomous agents and multi-step reasoning systems that integrate with the firm's core platforms to deliver measurable business impact.
Partner with data engineers and AIOps teams to ensure AI models are production-ready, scalable, and robust, from prototype to enterprise deployment.
Translate research in agentic AI, reinforcement learning, and reasoning into practical solutions that support underwriting, claims automation, customer servicing, and risk assessment.
Collaborate with product owners, engineers, and business leaders to define use cases, design solutions, and measure ROI.
Contribute to the Data Science Lab by establishing repeatable frameworks for developing, testing, and deploying agentic AI solutions.
Mentor junior data scientists and contribute to the standardization of AI/ML practices, tools, and frameworks across the firm.
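As a rough illustration of operationalizing an LLM against unstructured claims data (the item flagged above), here is a hedged sketch using the OpenAI Python client. The model name, field schema, and claim text are assumptions; any provider the firm has approved could be swapped in.

```python
# Hedged sketch: prompt an LLM to pull structured fields out of a claim
# document. Model name and field list are illustrative assumptions.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

claim_text = "Claimant reports rear-end collision on 03/12, estimate $4,200..."

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model choice
    messages=[
        {"role": "system", "content": "Extract fields and reply with JSON only."},
        {"role": "user", "content": f"Fields: incident_type, date, amount.\n\n{claim_text}"},
    ],
)
fields = json.loads(resp.choices[0].message.content)  # sketch: assumes clean JSON back
print(fields["incident_type"], fields["amount"])
```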
You are:
Passionate about pushing the frontier of AI while applying it to solve real-world business problems.
Excited by the potential of agentic AI, autonomous systems, and LLM-based solutions to transform industries.
A hands-on builder who thrives on seeing AI solutions move from proof-of-concept to real-world deployment.
Comfortable working in multi-disciplinary teams and engaging with senior business leaders to align AI solutions with enterprise goals.
You have:
PhD with 2+ years of experience, or a Master's degree with 4+ years of experience, in Statistics, Computer Science, Engineering, Applied Mathematics, or a related field
3+ years of hands-on AI modeling/development experience
Strong theoretical foundations in probability & statistics
Strong programming skills in Python, including PyTorch, TensorFlow, and LangGraph
Solid background in machine learning algorithms, optimization, and statistical modeling
Excellent communication skills and the ability to collaborate cross-functionally with Product, Engineering, and other disciplines at both the leadership and hands-on level
Excellent analytical and problem-solving abilities with superb attention to detail
Proven experience providing technical leadership and mentoring to data scientists, with strong management skills and the ability to monitor and track performance for enterprise success
This position pays $150.00 per hour on a W-2 hourly basis or $175.00 per hour on a corp-to-corp basis. The corp-to-corp rate is for independent contractors only, not third-party firms. No visa entanglements and no H-1B holding companies.
The interview process will include an initial phone or virtual interview screening.
Please let me know your interest in this position, availability to interview and start for this position along with a copy of your recent resume or please feel free to call me at any time with any questions.
Regards
Bill Stevens
Senior Technical Recruiter
PRI Technology
Denville, New Jersey 07834
**************
******************************
Senior Data Scientist
McLean, VA jobs
We are seeking a highly experienced **Principal Gen AI Scientist** with a strong focus on **Generative AI (GenAI)** to lead the design and development of cutting-edge AI Agents, Agentic Workflows and Gen AI Applications that solve complex business problems. This role requires advanced proficiency in Prompt Engineering, Large Language Models (LLMs), RAG, Graph RAG, MCP, A2A, multi-modal AI, Gen AI Patterns, Evaluation Frameworks, Guardrails, data curation, and AWS cloud deployments. You will serve as a hands-on Gen AI (data) scientist and critical thought leader, working alongside full stack developers, UX designers, product managers and data engineers to shape and implement enterprise-grade Gen AI solutions.
Key Responsibilities:
* Architect and implement scalable AI Agents, Agentic Workflows and GenAI applications to address diverse and complex business use cases.
* Develop, fine-tune, and optimize lightweight LLMs; lead the evaluation and adaptation of models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives.
* Design and deploy Retrieval-Augmented Generation (RAG) and Graph RAG systems using vector databases and knowledge bases.
* Curate enterprise data using connectors integrated with AWS Bedrock's Knowledge Base/Elastic.
* Implement solutions leveraging MCP (Model Context Protocol) and A2A (Agent-to-Agent) communication.
* Build and maintain Jupyter-based notebooks using platforms like SageMaker and MLFlow/Kubeflow on Kubernetes (EKS).
* Collaborate with cross-functional teams of UI and microservice engineers, designers, and data engineers to build full-stack Gen AI experiences.
* Integrate GenAI solutions with enterprise platforms via API-based methods and GenAI standardized patterns.
* Establish and enforce validation procedures with Evaluation Frameworks, bias mitigation, safety protocols, and guardrails for production-ready deployment.
* Design & build robust ingestion pipelines that extract, chunk, enrich, and anonymize data from PDFs, video, and audio sources for use in LLM-powered workflows, leveraging best practices like semantic chunking and privacy controls.
* Orchestrate multimodal pipelines using scalable frameworks (e.g., Apache Spark, PySpark) for automated ETL/ELT workflows appropriate for unstructured media.
* Implement embedding pipelines that map media content to vector representations using embedding models, and integrate them with vector stores (AWS KnowledgeBase/Elastic/Mongo Atlas) to support RAG architectures (a minimal sketch follows this list).
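A minimal sketch of the chunk-embed-retrieve loop behind the RAG and ingestion items above, assuming sentence-transformers for embeddings and an in-memory NumPy index as a stand-in for Bedrock Knowledge Base/Elastic/Mongo Atlas; chunk sizes, the model name, and the corpus are illustrative:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Fixed-size overlapping chunks; a production pipeline would use
    # semantic chunking (sentence/section boundaries) instead.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

documents = ["...contract text...", "...claim narrative..."]  # placeholder corpus
chunks = [c for doc in documents for c in chunk(doc)]
index = model.encode(chunks, normalize_embeddings=True)  # shape (n_chunks, dim)

query = "What is the claim amount?"
q = model.encode([query], normalize_embeddings=True)
scores = index @ q.T                       # cosine similarity (vectors normalized)
top = np.argsort(scores.ravel())[::-1][:3]
context = [chunks[i] for i in top]         # passed into the LLM prompt in a RAG flow
```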
**Required Qualifications:**
* 10+ years of experience in AI/ML, with 3+ years in applied GenAI or LLM-based solutions.
* Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases (e.g., AWS KnowledgeBase / Elastic), and multi-modal models.
* Proven experience with cloud-native AI development (AWS SageMaker, Bedrock, MLFlow on EKS).
* Strong programming skills in Python and ML libraries (Transformers, LangChain, etc.).
* Deep understanding of Gen AI system patterns, architectural best practices, and Evaluation Frameworks.
* Demonstrated ability to work in cross-functional agile teams.
* A GitHub code repository link is required for each candidate. Please thoroughly vet candidates.
**Preferred Qualifications:**
* Published contributions or patents in AI/ML/LLM domains.
* Hands-on experience with enterprise AI governance and ethical deployment frameworks.
* Familiarity with CI/CD practices for ML Ops and scalable inference APIs.
Senior Data Scientist
McLean, VA jobs
Purpose:
As a Data Scientist, you will play a key role in delivering impactful, data-driven solutions for our strategic enterprise clients. This role also offers the opportunity to shape and grow Infocepts' Data Science & AI practice, contributing to high-impact AI/ML initiatives, crafting data-driven narratives for stakeholders, and applying advanced techniques to solve complex business problems from strategy to execution.
Key Result Areas and Activities:
Design, build, and deploy AI/ML solutions using modern cloud and data platforms.
Lead data science projects across industries, ensuring alignment with business goals.
Apply supervised, unsupervised, deep learning, and Generative AI (e.g., LLMs, agentic workflows) techniques to address client use cases.
Collaborate with data engineering teams to optimize model pipelines using Delta Lake and Spark.
Communicate findings effectively through data visualizations and stakeholder presentations.
Drive adoption of MLOps practices for scalable and reliable model deployment (a minimal sketch follows this list).
Contribute to the evolution of Infocepts' Data Science & AI offerings through innovation and knowledge sharing.
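As a concrete example of the MLOps item above, here is a minimal, hedged sketch of experiment tracking with MLflow (listed in the essential skills below); the experiment name, data, and model are illustrative:

```python
# Track a model run with MLflow so it can be reproduced and promoted later.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

mlflow.set_experiment("churn-poc")  # assumed experiment name
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")  # artifact for later registration/promotion
```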
Roles & Responsibilities
Essential Skills
5+ years of experience in applied AI and machine/deep learning.
Hands-on experience with Databricks, MLflow, PySpark, and Spark MLlib.
Proficiency in Python and SQL for model development and data manipulation.
Strong understanding of supervised and unsupervised learning, deep learning, and Generative AI.
Familiarity with cloud platforms: AWS, Azure, and GCP.
Solid foundation in advanced statistical methods and probabilistic analysis.
Ability to lead end-to-end AI/ML projects, including design, development, and stakeholder management.
Experience with visualization tools like Tableau, Power BI, or similar.
Familiarity with ML workflow orchestration and MLOps practices.
Desirable Skills
Experience with LLMs (Large Language Models) and agentic AI workflows.
Familiarity with modern data platforms like Snowflake.
Exposure to real-time data processing in cloud-native environments.
Contributions to open-source AI projects or publications in data science communities.
Qualifications
Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, Statistics, or a related field.
Certifications in cloud platforms (AWS, Azure, GCP) or Databricks are a plus.
Qualities:
Able to consult, write, and present persuasively
Able to work in a self-organized and cross-functional team
Able to iterate based on new information, peer reviews, and feedback
Able to work seamlessly with clients across multiple geographies
Research-focused mindset
Excellent analytical, presentation, reporting, documentation, and interpersonal skills
"Infocepts is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law."
Applied Data Scientist/ Data Science Engineer
Austin, TX jobs
Role: Applied Data Scientist/ Data Science Engineer
Years of experience: 8+
Job type: Full-time
Job Responsibilities:
You will be part of a team that innovates and collaborates with internal stakeholders to deliver world-class solutions with a customer first mentality. This group is passionate about the data science field and is motivated to find opportunity in, and develop solutions for, evolving challenges.
You will:
Solve business and customer issues utilizing AI/ML - Mandatory
Build prototypes and scalable AI/ML solutions that will be integrated into software products
Collaborate with software engineers, business stakeholders and product owners in an Agile environment
Have complete ownership of model outcomes and drive continuous improvement
Essential Requirements:
Strong coding skills in Python and SQL - Mandatory
Machine Learning knowledge (Deep learning, Information Retrieval (RAG), GenAI , Classification, Forecasting, Regression, etc. on large datasets) with experience in ML model deployment
Ability to work with internal stakeholders to translate business questions into quantitative problem statements
Ability to effectively communicate data science progress to non-technical internal stakeholders
Ability to lead a team of data scientists is a plus
Experience with Big Data technologies and/or software development is a plus
Senior Data Scientist
McLean, VA jobs
Locals only; in-person interview required.
Job Title: Data Scientist Specialist
We are seeking a highly experienced Principal Gen AI Scientist with a strong focus on Generative AI (GenAI) to lead the design and development of cutting-edge AI Agents, Agentic Workflows and Gen AI Applications that solve complex business problems. This role requires advanced proficiency in Prompt Engineering, Large Language Models (LLMs), RAG, Graph RAG, MCP, A2A, multi-modal AI, Gen AI Patterns, Evaluation Frameworks, Guardrails, data curation, and AWS cloud deployments. You will serve as a hands-on Gen AI (data) scientist and critical thought leader, working alongside full stack developers, UX designers, product managers and data engineers to shape and implement enterprise-grade Gen AI solutions.
Responsibilities:
Architect and implement scalable AI Agents, Agentic Workflows and GenAI applications to address diverse and complex business use cases.
Develop, fine-tune, and optimize lightweight LLMs; lead the evaluation and adaptation of models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives.
Design and deploy Retrieval-Augmented Generation (RAG) and Graph RAG systems using vector databases and knowledge bases.
Curate enterprise data using connectors integrated with AWS Bedrock's Knowledge Base/Elastic.
Implement solutions leveraging MCP (Model Context Protocol) and A2A (Agent-to-Agent) communication.
Build and maintain Jupyter-based notebooks using platforms like AWS SageMaker and MLFlow/Kubeflow on Kubernetes (EKS).
Collaborate with cross-functional teams of UI and microservice engineers, designers, and data engineers to build full-stack Gen AI experiences.
Integrate GenAI solutions with enterprise platforms via API-based methods and GenAI standardized patterns.
Establish and enforce validation procedures with Evaluation Frameworks, bias mitigation, safety protocols, and guardrails for production-ready deployment (a minimal sketch follows this list).
Design & build robust ingestion pipelines that extract, chunk, enrich, and anonymize data from PDFs, video, and audio sources for use in LLM-powered workflows, leveraging best practices like semantic chunking and privacy controls.
Orchestrate multimodal pipelines using scalable frameworks (e.g., Apache Spark, PySpark) for automated ETL/ELT workflows appropriate for unstructured media.
Implement embedding pipelines that map media content to vector representations using embedding models, and integrate them with vector stores (AWS Knowledge Base/Elastic/Mongo Atlas) to support RAG architectures.
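To illustrate the validation-and-guardrails item above, here is a minimal evaluation-harness sketch in plain Python. The golden set, banned-term list, and pass threshold are assumptions; production frameworks add LLM-as-judge scoring, bias probes, and safety classifiers:

```python
# Run the model under test over a small golden set and gate deployment
# on simple pass/fail checks.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    must_contain: str  # substring expected in a correct answer

GOLDEN = [
    Case("What is the policy deductible?", "deductible"),
    Case("Summarize claim 42.", "claim"),
]
BANNED = ["ssn", "social security number"]  # crude PII guardrail

def evaluate(generate: Callable[[str], str], threshold: float = 0.9) -> bool:
    passed = 0
    for case in GOLDEN:
        answer = generate(case.prompt).lower()
        ok = case.must_contain in answer and not any(b in answer for b in BANNED)
        passed += ok
    score = passed / len(GOLDEN)
    print(f"pass rate: {score:.0%}")
    return score >= threshold  # gate: block deployment below threshold

# evaluate(my_llm_call)  # plug in the model under test
```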
Qualifications:
10+ years of experience in AI/ML, with 3+ years in applied GenAI or LLM-based solutions.
Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases (e.g., AWS Knowledge Base / Elastic), and multi-modal models.
Proven experience with cloud-native AI development (AWS SageMaker, Amazon Bedrock, MLFlow on EKS).
Strong programming skills in Python and ML libraries (Transformers, LangChain, etc.).
Deep understanding of Gen AI system patterns and architectural best practices, Evaluation Frameworks.
Demonstrated ability to work in cross-functional agile teams.
Data Scientist
Reston, VA jobs
• Collect, clean, and preprocess large datasets from multiple sources.
• Apply statistical analysis and machine learning techniques to solve business problems.
• Build predictive models and algorithms to optimize processes and improve outcomes.
• Develop dashboards and visualizations to communicate insights effectively.
• Collaborate with cross-functional teams (Product, Engineering, Risk, Marketing) to identify opportunities for leveraging data.
• Ensure data integrity, security, and compliance with organizational standards.
• Stay current with emerging technologies and best practices in data science and AI.
Required Qualifications
• Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Mathematics, or related field.
• Strong proficiency in Python, R, SQL, and experience with data manipulation libraries (e.g., Pandas, NumPy).
• Hands-on experience with machine learning frameworks (e.g., scikit-learn, TensorFlow, PyTorch).
• Solid understanding of statistical modeling, hypothesis testing, and data visualization.
• Experience with big data platforms (e.g., Spark, Hadoop) and cloud environments (AWS, Azure, GCP).
• Excellent problem-solving skills and ability to communicate complex concepts clearly.
Preferred Qualifications
• Experience in risk modeling, financial services, or product analytics.
• Knowledge of MLOps and deploying models in production.
• Familiarity with data governance and compliance frameworks.
Soft Skills
• Strong analytical thinking and attention to detail.
• Ability to work independently and in a team environment.
• Effective communication and stakeholder management skills.
Machine Learning Engineer / Data Scientist / GenAI
New York, NY jobs
NYC NY / Hybrid
12+ Months
Project - Leveraging Llama to extract cybersecurity insights out of unstructured data from their ticketing system.
Must have strong experience with:
Llama
Python
Hadoop
MCP
Machine Learning (ML)
They need a strong developer who can use Llama and Hadoop (where the data sits) and has experience with MCP. They have various ways to pull the data out of their tickets, but they want someone who can come in, recommend the best approach, and then get it done. They have tight timelines.
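A hedged sketch of the core task: prompting a Llama model to extract structured insights from ticket text via Hugging Face transformers. The model name, prompt, and inline ticket are assumptions; in practice the tickets would be read from Hadoop (e.g., via Spark or Hive) rather than a Python list.

```python
# Prompt a Llama model to pull cybersecurity fields out of raw ticket text.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed; any instruct-tuned Llama works
)

tickets = ["User reported phishing email with attachment invoice.zip ..."]  # placeholder

PROMPT = (
    "Extract from this ticket: attack type, affected asset, and severity "
    "(low/medium/high). Reply as JSON.\n\nTicket: {ticket}\n\nJSON:"
)

for ticket in tickets:
    out = generator(PROMPT.format(ticket=ticket), max_new_tokens=128)
    print(out[0]["generated_text"])
```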
Thanks and Regards!
Lavkesh Dwivedi
************************
Amtex System Inc.
28 Liberty Street, 6th Floor | New York, NY - 10005
************
********************
Senior Data Governance Consultant (Informatica)
Plano, TX jobs
Senior Data Governance Consultant (Informatica)
About Paradigm - Intelligence Amplified
Paradigm is a strategic consulting firm that turns vision into tangible results. For over 30 years, we've helped Fortune 500 and high-growth organizations accelerate business outcomes across data, cloud, and AI. From strategy through execution, we empower clients to make smarter decisions, move faster, and maximize return on their technology investments. What sets us apart isn't just what we do; it's how we do it. Driven by a clear mission and values rooted in integrity, excellence, and collaboration, we deliver work that creates lasting impact. At Paradigm, your ideas are heard, your growth is prioritized, and your contributions make a difference.
Summary:
We are seeking a Senior Data Governance Consultant to lead and enhance data governance capabilities across a financial services organization
The Senior Data Governance Consultant will collaborate closely with business, risk, compliance, technology, and data management teams to define data standards, strengthen data controls, and drive a culture of data accountability and stewardship
The ideal candidate will have deep experience in developing and implementing data governance frameworks, data policies, and control mechanisms that ensure compliance, consistency, and trust in enterprise data assets
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
This position is Remote, with occasional travel to Plano, TX
Responsibilities:
Data Governance Frameworks:
Design, implement, and enhance data governance frameworks aligned with regulatory expectations (e.g., BCBS 239, GDPR, CCPA, DORA) and internal control standards
Policy & Standards Development:
Develop, maintain, and operationalize data policies, standards, and procedures that govern data quality, metadata management, data lineage, and data ownership
Control Design & Implementation:
Define and embed data control frameworks across data lifecycle processes to ensure data integrity, accuracy, completeness, and timeliness
Risk & Compliance Alignment:
Work with risk and compliance teams to identify data-related risks and ensure appropriate mitigation and monitoring controls are in place
Stakeholder Engagement:
Partner with data owners, stewards, and business leaders to promote governance practices and drive adoption of governance tools and processes
Data Quality Management:
Define and monitor data quality metrics and KPIs, establishing escalation and remediation procedures for data quality issues
Metadata & Lineage:
Support metadata and data lineage initiatives to increase transparency and enable traceability across systems and processes
Reporting & Governance Committees:
Prepare materials and reporting for data governance forums, risk committees, and senior management updates
Change Management & Training:
Develop communication and training materials to embed governance culture and ensure consistent understanding across the organization
Required Qualifications:
7+ years of experience in data governance, data management, or data risk roles within financial services (banking, insurance, or asset management preferred)
Strong knowledge of data policy development, data standards, and control frameworks
Proven experience aligning data governance initiatives with regulatory and compliance requirements
Familiarity with Informatica data governance and metadata tools
Excellent communication skills with the ability to influence senior stakeholders and translate technical concepts into business language
Deep understanding of data management principles (DAMA-DMBOK, DCAM, or equivalent frameworks)
Bachelor's or Master's Degree in Information Management, Data Science, Computer Science, Business, or related field
Preferred Qualifications:
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
Experience with data risk management or data control testing
Knowledge of financial regulatory frameworks (e.g., Basel, MiFID II, Solvency II, BCBS 239)
Certifications, such as Informatica, CDMP, or DCAM
Background in consulting or large-scale data transformation programs
Key Competencies:
Strategic and analytical thinking
Strong governance and control mindset
Excellent stakeholder and relationship management
Ability to drive organizational change and embed governance culture
Attention to detail with a pragmatic approach
Why Join Paradigm
At Paradigm, integrity drives innovation. You'll collaborate with curious, dedicated teammates, solving complex problems and unlocking immense data value for leading organizations. If you seek a place where your voice is heard, growth is supported, and your work creates lasting business value, you belong at Paradigm.
Learn more at ********************
Policy Disclosure:
Paradigm maintains a strict drug-free workplace policy. All offers of employment are contingent upon successfully passing a standard 5-panel drug screen. Please note that a positive test result for any prohibited substance, including marijuana, will result in disqualification from employment, regardless of state laws permitting its use. This policy applies consistently across all positions and locations.
Snowflake Data Engineer (DBT SQL)
San Jose, CA jobs
Job Description - Snowflake Data Engineer (DBT SQL)
Duration: 6 months
Key Responsibilities
• Design, develop, and optimize data pipelines using Snowflake and DBT SQL.
• Implement and manage data warehousing concepts, metadata management, and data modeling.
• Work with data lakes, multi-dimensional models, and data dictionaries.
• Utilize Snowflake features such as Time Travel and Zero-Copy Cloning (see the sketch after this list).
• Perform query performance tuning and cost optimization in cloud environments.
• Administer Snowflake architecture, warehousing, and processing.
• Develop and maintain PL/SQL Snowflake solutions.
• Apply design patterns for scalable and maintainable data solutions.
• Collaborate with cross-functional teams and tech leads across multiple tracks.
• Provide technical and functional guidance to team members.
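For reference, a brief sketch of the two Snowflake features named above, executed through the Snowflake Python connector; the connection parameters and table names are placeholders:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",  # placeholder credentials
    warehouse="ANALYTICS_WH", database="SALES", schema="PUBLIC",
)
cur = conn.cursor()

# Time Travel: query the table as it looked one hour ago.
cur.execute("SELECT COUNT(*) FROM orders AT(OFFSET => -3600)")
print(cur.fetchone())

# Zero-Copy Cloning: instant copy that shares storage until rows diverge.
cur.execute("CREATE TABLE orders_dev CLONE orders")

cur.close()
conn.close()
```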
Required Skills & Experience
• Hands-on Snowflake development experience (mandatory).
• Strong proficiency in SQL and DBT SQL.
• Knowledge of data warehousing concepts, metadata management, and data modeling.
• Experience with data lakes, multi-dimensional models, and data dictionaries.
• Expertise in Snowflake features (Time Travel, Zero-Copy Cloning).
• Strong background in query optimization and cost management.
• Familiarity with Snowflake administration and pipeline development.
• Knowledge of PL/SQL and SQL databases (additional plus).
• Excellent communication, leadership, and organizational skills.
• Strong team player with a positive attitude.
Sr Data Platform Engineer
Elk Grove, CA jobs
Hybrid role, 3x a week in office in Elk Grove, CA; no remote option.
This is a direct hire opportunity.
We're seeking a seasoned Senior Data Platform Engineer to design, build, and optimize scalable data solutions that power analytics, reporting, and AI/ML initiatives. This full‑time role is hands‑on, working with architects, analysts, and business stakeholders to ensure data systems are reliable, secure, and high‑performing.
Responsibilities:
Build and maintain robust data pipelines (structured, semi‑structured, unstructured).
Implement ETL workflows with Spark, Delta Lake, and cloud-native tools (see the sketch after this list).
Support big data platforms (Databricks, Snowflake, GCP) in production.
Troubleshoot and optimize SQL queries, Spark jobs, and workloads.
Ensure governance, security, and compliance across data systems.
Integrate workflows into CI/CD pipelines with Git, Jenkins, Terraform.
Collaborate cross‑functionally to translate business needs into technical solutions.
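A minimal sketch of the Spark + Delta Lake ETL item above, assuming the delta-spark package is installed; paths and column names are illustrative, and a production job would add schema enforcement and MERGE logic:

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("orders-etl")
    # Delta Lake wiring; assumes delta-spark is on the classpath.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

raw = spark.read.json("/lake/raw/orders/")  # placeholder landing path
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
)
clean.write.format("delta").mode("append").save("/lake/silver/orders/")
```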
Qualifications:
7+ years in data engineering with production pipeline experience.
Expertise in Spark ecosystem, Databricks, Snowflake, GCP.
Strong skills in PySpark, Python, SQL.
Experience with RAG systems, semantic search, and LLM integration.
Familiarity with Kafka, Pub/Sub, vector databases.
Proven ability to optimize ETL jobs and troubleshoot production issues.
Agile team experience and excellent communication skills.
Certifications in Databricks, Snowflake, GCP, or Azure.
Exposure to Airflow, BI tools (Power BI, Looker Studio).
Data Engineer (AWS Redshift, BI, Python, ETL)
Manhattan Beach, CA jobs
We are seeking a skilled Data Engineer with strong experience in business intelligence (BI) and data warehouse development to join our team. In this role, you will design, build, and optimize data pipelines and warehouse architectures that support analytics, reporting, and data-driven decision-making. You will work closely with analysts, data scientists, and business stakeholders to ensure reliable, scalable, and high-quality data solutions.
Responsibilities:
Develop and maintain ETL/ELT pipelines for ingesting, transforming, and delivering data.
Design and enhance data warehouse models (star/snowflake schemas) and BI datasets.
Optimize data workflows for performance, scalability, and reliability.
Collaborate with BI teams to support dashboards, reporting, and analytics needs.
Ensure data quality, governance, and documentation across all solutions.
Qualifications:
Proven experience with data engineering tools (SQL, Python, ETL frameworks).
Strong understanding of BI concepts, reporting tools, and dimensional modeling.
Hands-on experience with cloud data platforms (e.g., AWS, Azure, GCP) is a plus.
Excellent problem-solving skills and ability to work in a cross-functional environment.
Data Engineer w/ Python & SQL
Alpharetta, GA jobs
We're looking for a Data Engineer to build and maintain scalable data pipelines and cloud data infrastructure on GCP. The role focuses on BigQuery, Dataflow, and modern ETL/ELT to support analytics and ML workflows.
MUST HAVES
A problem solver with the ability to analyze and research complex issues and propose actionable solutions and strategies.
Solid understanding of, and hands-on experience with, major cloud platforms.
Experience in designing and implementing data pipelines.
Must have strong Python, SQL & GCP skills
Responsibilities
Build and optimize batch/streaming pipelines using Dataflow, Pub/Sub, and Composer (see the sketch after this list).
Develop and tune BigQuery models, queries, and ingestion processes.
Implement IaC (Terraform), CI/CD, monitoring, and data quality checks.
Ensure data governance, security, and reliable pipeline operations.
Collaborate with data science teams and support Vertex AI-based ML workflows.
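A hedged sketch of the Pub/Sub-to-BigQuery streaming pattern from the first responsibility above, written with Apache Beam (the SDK that Dataflow runs); the topic, table, and schema are assumptions:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)  # on GCP, add runner=DataflowRunner etc.

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(topic="projects/my-proj/topics/events")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "Write" >> beam.io.WriteToBigQuery(
            "my-proj:analytics.events",
            schema="user_id:STRING,event:STRING,ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```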
Must-Have
Must have strong Python, SQL & GCP skills
3-5+ years of data engineering experience.
Hands-on GCP experience (BigQuery, Dataflow, Pub/Sub).
Solid ETL/ELT and data modeling experience.
Nice-to-Have
GCP certifications, Spark, Kafka, Airflow, dbt/Dataform, Docker/K8s.
Data Engineer
Jersey City, NJ jobs
ONLY LOCALS TO NJ/NY - NO RELOCATION CANDIDATES
Skillset: Data Engineer
Must Haves: Python, PySpark, AWS - ECS, Glue, Lambda, S3
Nice to Haves: Java, Spark, React Js
Interview Process: 2 rounds; the 2nd will be onsite
You're ready to gain the skills and experience needed to grow within your role and advance your career - and we have the perfect software engineering opportunity for you.
As a Data Engineer III - Python / Spark / Data Lake at JPMorgan Chase within the Consumer and Community Bank, you will be a seasoned member of an agile team, tasked with designing and delivering reliable data collection, storage, access, and analytics solutions that are secure, stable, and scalable. Your responsibilities will include developing, testing, and maintaining essential data pipelines and architectures across diverse technical areas, supporting various business functions to achieve the firm's business objectives.
Job responsibilities:
• Supports review of controls to ensure sufficient protection of enterprise data.
• Advises and makes custom configuration changes in one to two tools to generate a product at the business or customer request.
• Updates logical or physical data models based on new use cases.
• Frequently uses SQL and understands NoSQL databases and their niche in the marketplace.
• Adds to team culture of diversity, opportunity, inclusion, and respect.
• Develop enterprise data models; design, develop, and maintain large-scale data processing pipelines and infrastructure; lead code reviews and provide mentoring through the process; drive data quality; ensure data accessibility for analysts and data scientists; ensure compliance with data governance requirements; and ensure data engineering practices align with business goals.
Required qualifications, capabilities, and skills
• Formal training or certification on data engineering concepts and 2+ years applied experience
• Experience across the data lifecycle, advanced experience with SQL (e.g., joins and aggregations), and working understanding of NoSQL databases
• Experience with statistical data analysis and ability to determine appropriate tools and data patterns to perform analysis
• Extensive experience in AWS, design, implementation, and maintenance of data pipelines using Python and PySpark.
• Proficient in Python and PySpark, able to write and execute complex queries to perform curation and build views required by end users (single and multi-dimensional).
• Proven experience in performance and tuning to ensure jobs are running at optimal levels and no performance bottleneck.
• Advanced proficiency in leveraging Gen AI models from Anthropic (or OpenAI, or Google) using APIs/SDKs
• Advanced proficiency in cloud data lakehouse platform such as AWS data lake services, Databricks or Hadoop, relational data store such as Postgres, Oracle or similar, and at least one NOSQL data store such as Cassandra, Dynamo, MongoDB or similar
• Advanced proficiency in Cloud Data Warehouse Snowflake, AWS Redshift
• Advanced proficiency in at least one scheduling/orchestration tool such as Airflow, AWS Step Functions or similar
• Proficiency in Unix scripting; data structures; data serialization formats such as JSON, Avro, or Protobuf; big-data storage formats such as Parquet or Iceberg; data processing methodologies such as batch, micro-batching, or streaming; one or more data modeling techniques such as Dimensional, Data Vault, Kimball, or Inmon; Agile methodology; TDD or BDD; and CI/CD tools.
Preferred qualifications, capabilities, and skills
• Knowledge of data governance and security best practices.
• Experience in carrying out data analysis to support business insights.
• Strong Python and Spark
Azure Data Engineer Sr
Irving, TX jobs
Minimum 7 years of relevant work experience in data engineering, with at least 2 years in data modeling.
Strong technical foundation in Python and SQL, and experience with cloud platforms (Azure).
Deep understanding of data engineering fundamentals, including database architecture and design, Extract, transform and load (ETL) processes, data lakes, data warehousing, and both batch and streaming technologies.
Experience with data orchestration tools (e.g., Airflow), data processing frameworks (e.g., Spark, Databricks), and data visualization tools (e.g., Tableau, Power BI).
Proven ability to lead a team of engineers, fostering a collaborative and high-performing environment.
Sr. Data Engineer (SQL+Python+AWS)
Saint Petersburg, FL jobs
Looking for a Sr. Data Engineer (SQL+Python+AWS) for a 12+ month contract (potential extension; may convert to full-time), hybrid at St. Petersburg, FL 33716, with a direct financial client. W-2 only; US citizens or green card holders.
Notes from the Hiring Manager:
• Setting up Python environments and data structures to support the Data Science/ML team.
• No prior Data Science or Machine Learning experience required.
• Role involves building new data pipelines and managing file-loading connections.
• Strong SQL skills are essential.
• Contract-to-hire position.
• Hybrid role based in St. Pete, FL (33716) only.
Duties:
This role builds and maintains data pipelines that connect Oracle-based source systems to AWS cloud environments, providing well-structured data for analysis and machine learning in AWS SageMaker.
It includes working closely with data scientists to deliver scalable data workflows as a foundation for predictive modeling and analytics.
• Develop and maintain data pipelines to extract, transform, and load data from Oracle databases and other systems into AWS environments (S3, Redshift, Glue, etc.); a minimal sketch follows this list.
• Collaborate with data scientists to ensure data is prepared, cleaned, and optimized for SageMaker-based machine learning workloads.
• Implement and manage data ingestion frameworks, including batch and streaming pipelines.
• Automate and schedule data workflows using AWS Glue, Step Functions, or Airflow.
• Develop and maintain data models, schemas, and cataloging processes for discoverability and consistency.
• Optimize data processes for performance and cost efficiency.
• Implement data quality checks, validation, and governance standards.
• Work with DevOps and security teams to comply with RJ standards.
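A minimal sketch of the first duty above: pulling a table from Oracle and landing it in S3 as Parquet for downstream Glue/SageMaker use, with the pandas/boto3/pyodbc stack the skills list below names. The DSN, query, and bucket are placeholders, and a Glue-native job would use DynamicFrames instead.

```python
import pandas as pd
import pyodbc  # assumes an Oracle ODBC driver (DSN) is configured

conn = pyodbc.connect("DSN=ORACLE_PROD;UID=etl_user;PWD=...")  # placeholder DSN
df = pd.read_sql("SELECT * FROM accounts WHERE updated_at >= SYSDATE - 1", conn)

# Light cleanup before landing in the lake.
df.columns = [c.lower() for c in df.columns]
df = df.drop_duplicates(subset=["account_id"])

# Write Parquet straight to S3 (pandas delegates to s3fs/pyarrow).
df.to_parquet("s3://my-datalake/raw/accounts/dt=2024-01-01/accounts.parquet")
```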
Skills:
Required:
• Strong proficiency with SQL and hands-on experience working with Oracle databases.
• Experience designing and implementing ETL/ELT pipelines and data workflows.
• Hands-on experience with AWS data services, such as S3, Glue, Redshift, Lambda, and IAM.
• Proficiency in Python for data engineering (pandas, boto3, pyodbc, etc.).
• Solid understanding of data modeling, relational databases, and schema design.
• Familiarity with version control, CI/CD, and automation practices.
• Ability to collaborate with data scientists to align data structures with model and analytics requirements
Preferred:
• Experience integrating data for use in AWS SageMaker or other ML platforms.
• Exposure to MLOps or ML pipeline orchestration.
• Familiarity with data cataloging and governance tools (AWS Glue Catalog, Lake Formation).
• Knowledge of data warehouse design patterns and best practices.
• Experience with data orchestration tools (e.g., Apache Airflow, Step Functions).
• Working knowledge of Java is a plus.
Education:
B.S. in Computer Science, MIS or related degree and a minimum of five (5) years of related experience or combination of education, training and experience.
Senior Data Engineer
McLean, VA jobs
The candidate must have 5+ years of hands-on experience working with PySpark/Python, microservices architecture, AWS EKS, SQL, Postgres, DB2, Snowflake, Behave or Cucumber frameworks, Pytest (unit testing), automation testing, and regression testing.
Experience with tools such as Jenkins, SonarQube AND/OR Fortify are preferred for this role.
Experience in Angular and DevOps are nice to haves for this role.
Must Have Qualifications: PySpark/Python based microservices, AWS EKS, Postgres SQL Database, Behave/Cucumber for automation, Pytest, Snowflake, Jenkins, SonarQube and Fortify.
Responsibilities:
Development of microservices based on Python, PySpark, AWS EKS, AWS Postgres for a data-oriented modernization project.
New System: Python and PySpark, AWS Postgres DB, Behave/Cucumber for automation, and Pytest
Perform System, functional and data analysis on the current system and create technical/functional requirement documents.
Current System: Informatica, SAS, AutoSys, DB2
Write automated tests using Behave/Cucumber, based on the new microservices-based architecture
Promote top code quality and solve issues related to performance tuning and scalability.
Strong skills in DevOps, Docker/container-based deployments to AWS EKS using Jenkins and experience with SonarQube and Fortify.
Able to communicate and engage with business teams and analyze the current business requirements (BRS documents) and create necessary data mappings.
Strong skills and experience in reporting application development and data analysis are preferred
Knowledge in Agile methodologies and technical documentation.
ETL Data Engineer with SpringBatch Experience-- SHADC5693360
Smithfield, RI jobs
Job Title: ETL Data Engineer with SpringBatch Experience - W2 only - We can provide sponsorship
Duration: Long Term
MUST HAVES:
Strong SQL for querying and data validation
Oracle
AWS
ETL experience with Java SpringBatch (for the ETL data transformation).
Note: the ETL work is done in Java (so Python is only a nice-to-have).
The Expertise and Skills You Bring
Bachelor's or Master's Degree in a technology-related field (e.g., Engineering, Computer Science, etc.) required, with 5+ years of working experience
4+ years of Java development utilizing Spring frameworks. Experience writing batch jobs with Spring Batch is a must
2+ years of experience developing applications that run in AWS, with focus on AWS Batch, S3, IAM
3+ years working with SQL (ANSI SQL, Oracle, Snowflake)
2+ years of Python development
Experience with Unix shell scripting (bash, ksh) and scheduling / orchestration tools (Control-M)
Strong data modeling skills with experience working with 3NF and Star Schema data models
Proven data analysis skills; not afraid to work in a complex data ecosystem
Hands-on experience with SQL query optimization and tuning to improve performance is desirable
Experience with DevOps, Continuous Integration and Continuous Delivery (Jenkins, Terraform, CloudFormation)
Experience in Agile methodologies (Kanban and SCRUM)
Experience building and deploying containerized applications using Docker
Work experience in the financial services industry is a plus
Proven track record of handling ambiguity and working in a fast-paced environment, either independently or collaboratively
Good interpersonal skills to work with multiple teams within the business unit and across the organization
Data Engineer
Bloomington, MN jobs
Are you an experienced Data Engineer with a desire to excel? If so, then Talent Software Services may have the job for you! Our client is seeking an experienced Data Engineer to work at their company in Bloomington, MN.
Primary Responsibilities/Accountabilities:
Develop and maintain scalable ETL/ELT pipelines using Databricks and Airflow.
Build and optimize Python-based data workflows and SQL queries for large datasets.
Ensure data quality, reliability, and high performance across pipelines.
Collaborate with cross-functional teams to support analytics and reporting requirements.
Monitor, troubleshoot, and improve production data workflows.
Qualifications:
Strong hands-on experience with Databricks, Python, SQL, and Apache Airflow.
6-10+ years of experience in Data Engineering.
Experience with cloud platforms (Azure/AWS/GCP) and big data ecosystems.
Solid understanding of data warehousing, data modelling, and distributed data processing.
Python Data Engineer- THADC5693417
Houston, TX jobs
Must Haves:
Strong proficiency in Python; 5+ years' experience.
Expertise in FastAPI and microservices architecture and coding
Linking Python-based apps with SQL and NoSQL databases
Deployments on Docker, Kubernetes, and monitoring tools
Experience with automated testing and test-driven development
Git source control, GitHub Actions, CI/CD, VS Code, and Copilot
Expertise in both on-prem SQL databases (Oracle, SQL Server, Postgres, DB2) and NoSQL databases
Working knowledge of data warehousing and ETL
Able to explain the business functionality of the projects/applications they have worked on
Ability to multitask and simultaneously work on multiple projects
NO CLOUD - they are on-prem
Day to Day:
Insight Global is looking for a Python Data Engineer for one of our largest oil and gas clients in Downtown Houston, TX. This person will be responsible for building Python-based integrations between back-end SQL and NoSQL databases, architecting and coding FastAPI microservices, and performing testing on back-office applications. The ideal candidate will have experience developing applications using Python and microservices and implementing complex business functionality in Python.
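As a rough illustration of the FastAPI-plus-database work described above, here is a minimal, hedged microservice sketch; the table, DSN, and response model are illustrative, not the client's actual schema:

```python
import pyodbc
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="wells-service")  # assumed service name


class Well(BaseModel):
    well_id: int
    name: str
    status: str


def get_conn():
    # Placeholder DSN; on-prem SQL Server/Oracle/Postgres per the stack above.
    return pyodbc.connect("DSN=ONPREM_DB;UID=svc_user;PWD=...")


@app.get("/wells/{well_id}", response_model=Well)
def read_well(well_id: int) -> Well:
    cur = get_conn().cursor()
    row = cur.execute(
        "SELECT well_id, name, status FROM wells WHERE well_id = ?", well_id
    ).fetchone()
    if row is None:
        raise HTTPException(status_code=404, detail="well not found")
    return Well(well_id=row[0], name=row[1], status=row[2])

# Run locally with: uvicorn main:app --reload
```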