Senior Agentic AI Data Scientist
Bethlehem, PA jobs
We need HANDS-ON engineering leaders, not architects.
MUST BE VERY SEASONED DATA SCIENCE ENGINEERS WHO ARE WILLING TO DO A SHORT ONLINE TEST
Hybrid role - candidates must be able to work onsite 2-3 days a week in Hudson Yards, NY or Bethlehem, PA
I will not entertain out-of-state candidates.
We're looking for a very Senior Data Scientist - Agentic AI with strong hands-on experience in AI/ML, LLMs, and intelligent automation. This role will focus on building, deploying, and scaling Agentic AI systems and enterprise-level generative AI solutions that transform business operations and customer experiences.
You'll work on high-visibility projects alongside senior leadership, translating cutting-edge AI research into real-world impact.
Key Responsibilities:
Design and deploy Agentic AI solutions to automate complex workflows.
Operationalize LLMs and generative AI to process unstructured data (contracts, claims, medical records, etc.).
Build autonomous agents and reasoning systems integrated into enterprise platforms.
Partner with engineering and AIOps teams to move models from prototype to production.
Translate research in reinforcement learning and reasoning into business-ready AI applications.
Mentor junior data scientists and establish best practices for scalable AI development.
What We're Looking For:
PhD (2+ yrs) or Master's (10+ yrs) in Statistics, Computer Science, Engineering, or Applied Mathematics.
5+ years of hands-on AI/ML development experience.
Strong programming skills in Python, PyTorch, TensorFlow, LangGraph.
Proven background in machine learning, optimization, and statistical modeling.
Excellent communication, leadership, and cross-functional collaboration skills.
Applied Data Scientist/ Data Science Engineer
Austin, TX jobs
Role: Applied Data Scientist / Data Science Engineer
Years of experience: 8+
Job type: Full-time
Job Responsibilities:
You will be part of a team that innovates and collaborates with internal stakeholders to deliver world-class solutions with a customer first mentality. This group is passionate about the data science field and is motivated to find opportunity in, and develop solutions for, evolving challenges.
You will:
Solve business and customer issues utilizing AI/ML - Mandatory
Build prototypes and scalable AI/ML solutions that will be integrated into software products
Collaborate with software engineers, business stakeholders and product owners in an Agile environment
Have complete ownership of model outcomes and drive continuous improvement
Essential Requirements:
Strong coding skills in Python and SQL - Mandatory
Machine Learning knowledge (Deep Learning, Information Retrieval (RAG), GenAI, Classification, Forecasting, Regression, etc. on large datasets) with experience in ML model deployment
Ability to work with internal stakeholders to translate business questions into quantitative problem statements
Ability to effectively communicate data science progress to non-technical internal stakeholders
Ability to lead a team of data scientists is a plus
Experience with Big Data technologies and/or software development is a plus
Data Scientist
Reston, VA jobs
• Collect, clean, and preprocess large datasets from multiple sources.
• Apply statistical analysis and machine learning techniques to solve business problems.
• Build predictive models and algorithms to optimize processes and improve outcomes.
• Develop dashboards and visualizations to communicate insights effectively.
• Collaborate with cross-functional teams (Product, Engineering, Risk, Marketing) to identify opportunities for leveraging data.
• Ensure data integrity, security, and compliance with organizational standards.
• Stay current with emerging technologies and best practices in data science and AI.
________________________________________
Required Qualifications
• Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Mathematics, or related field.
• Strong proficiency in Python, R, SQL, and experience with data manipulation libraries (e.g., Pandas, NumPy).
• Hands-on experience with machine learning frameworks (e.g., scikit-learn, TensorFlow, PyTorch).
• Solid understanding of statistical modeling, hypothesis testing, and data visualization.
• Experience with big data platforms (e.g., Spark, Hadoop) and cloud environments (AWS, Azure, GCP).
• Excellent problem-solving skills and ability to communicate complex concepts clearly.
________________________________________
Preferred Qualifications
• Experience in risk modeling, financial services, or product analytics.
• Knowledge of MLOps and deploying models in production.
• Familiarity with data governance and compliance frameworks.
________________________________________
Soft Skills
• Strong analytical thinking and attention to detail.
• Ability to work independently and in a team environment.
• Effective communication and stakeholder management skills.
Data Scientist Specialist
McLean, VA jobs
Job Title: Data Scientist Specialist
Duration: 45 Minutes | 120 Minutes
Interview Type: MS Teams (video mandatory) | 2nd round on-site
Call notes:
Flexibility in hands on experience
10 years of experience not required
ML and Gen AI
2 years of Gen AI experience
Gen AI development
This role comes between software engineer and Data Scientist
Data Engineer candidates can be considered
BS/MS in AI or Data Science preferred
PhD not needed
Senior Data Scientist
McLean, VA jobs
We are seeking a highly experienced **Principal Gen AI Scientist** with a strong focus on **Generative AI (GenAI)** to lead the design and development of cutting-edge AI Agents, Agentic Workflows and Gen AI Applications that solve complex business problems. This role requires advanced proficiency in Prompt Engineering, Large Language Models (LLMs), RAG, Graph RAG, MCP, A2A, multi-modal AI, Gen AI Patterns, Evaluation Frameworks, Guardrails, data curation, and AWS cloud deployments. You will serve as a hands-on Gen AI (data) scientist and critical thought leader, working alongside full stack developers, UX designers, product managers and data engineers to shape and implement enterprise-grade Gen AI solutions.
Key Responsibilities:
* Architect and implement scalable AI Agents, Agentic Workflows and GenAI applications to address diverse and complex business use cases.
* Develop, fine-tune, and optimize lightweight LLMs; lead the evaluation and adaptation of models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives.
* Design and deploy Retrieval-Augmented Generation (RAG) and Graph RAG systems using vector databases and knowledge bases.
* Curate enterprise data using connectors integrated with AWS Bedrock's Knowledge Base/Elastic
* Implement solutions leveraging MCP (Model Context Protocol) and A2A (Agent-to-Agent) communication.
* Build and maintain Jupyter-based notebooks using platforms like SageMaker and MLFlow/Kubeflow on Kubernetes (EKS).
* Collaborate with cross-functional teams of UI and microservice engineers, designers, and data engineers to build full-stack Gen AI experiences.
* Integrate GenAI solutions with enterprise platforms via API-based methods and GenAI standardized patterns.
* Establish and enforce validation procedures with Evaluation Frameworks, bias mitigation, safety protocols, and guardrails for production-ready deployment.
* Design and build robust ingestion pipelines that extract, chunk, enrich, and anonymize data from PDFs, video, and audio sources for use in LLM-powered workflows, leveraging best practices like semantic chunking and privacy controls
* Orchestrate multimodal pipelines using scalable frameworks (e.g., Apache Spark, PySpark) for automated ETL/ELT workflows appropriate for unstructured media
* Implement embeddings: map media content to vector representations using embedding models, and integrate with vector stores (AWS KnowledgeBase/Elastic/Mongo Atlas) to support RAG architectures
**Required Qualifications:**
* 10+ years of experience in AI/ML, with 3+ years in applied GenAI or LLM-based solutions.
* Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases (e.g., AWS KnowledgeBase / Elastic), and multi-modal models.
* Proven experience with cloud-native AI development (AWS SageMaker, Bedrock, MLFlow on EKS).
* Strong programming skills in Python and ML libraries (Transformers, LangChain, etc.).
* Deep understanding of Gen AI system patterns and architectural best practices, Evaluation Frameworks
* Demonstrated ability to work in cross-functional agile teams.
* A GitHub code repository link is required for each candidate. Please thoroughly vet the candidates.
**Preferred Qualifications:**
* Published contributions or patents in AI/ML/LLM domains.
* Hands-on experience with enterprise AI governance and ethical deployment frameworks.
* Familiarity with CI/CD practices for ML Ops and scalable inference APIs.
Senior Data Scientist
McLean, VA jobs
Purpose:
As a Data Scientist, you will play a key role in delivering impactful, data-driven solutions for our strategic enterprise clients. This role also offers the opportunity to shape and grow Infocepts' Data Science & AI practice, contributing to high-impact AI/ML initiatives, crafting data-driven narratives for stakeholders, and applying advanced techniques to solve complex business problems from strategy to execution.
Key Result Areas and Activities:
Design, build, and deploy AI/ML solutions using modern cloud and data platforms.
Lead data science projects across industries, ensuring alignment with business goals.
Apply supervised, unsupervised, deep learning, and Generative AI (e.g., LLMs, agentic workflows) techniques to address client use cases.
Collaborate with data engineering teams to optimize model pipelines using Delta Lake and Spark.
Communicate findings effectively through data visualizations and stakeholder presentations.
Drive adoption of MLOps practices for scalable and reliable model deployment.
Contribute to the evolution of Infocepts' Data Science & AI offerings through innovation and knowledge sharing.
Roles & Responsibilities
Essential Skills
5+ years of experience in applied AI and machine/deep learning.
Hands-on experience with Databricks, MLflow, PySpark, and Spark MLlib.
Proficiency in Python and SQL for model development and data manipulation.
Strong understanding of supervised and unsupervised learning, deep learning, and Generative AI.
Familiarity with cloud platforms: AWS, Azure, and GCP.
Solid foundation in advanced statistical methods and probabilistic analysis.
Ability to lead end-to-end AI/ML projects, including design, development, and stakeholder management.
Experience with visualization tools like Tableau, Power BI, or similar.
Familiarity with ML workflow orchestration and MLOps practices.
Desirable Skills
Experience with LLMs (Large Language Models) and agentic AI workflows.
Familiarity with modern data platforms like Snowflake.
Exposure to real-time data processing in cloud-native environments.
Contributions to open-source AI projects or publications in data science communities.
Qualifications
Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, Statistics, or a related field.
Certifications in cloud platforms (AWS, Azure, GCP) or Databricks are a plus.
Qualities:
Able to consult, write, and present persuasively
Able to work in a self-organized and cross-functional team
Able to iterate based on new information, peer reviews, and feedback
Able to work seamlessly with clients across multiple geographies
Research focused mindset
Excellent analytical, presentation, reporting, documentation and interactive skills
"Infocepts is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law."
Senior Data Scientist
McLean, VA jobs
Locals only; in-person interview
Job Title: Data Scientist Specialist
We are seeking a highly experienced Principal Gen AI Scientist with a strong focus on Generative AI (GenAI) to lead the design and development of cutting-edge AI Agents, Agentic Workflows and Gen AI Applications that solve complex business problems. This role requires advanced proficiency in Prompt Engineering, Large Language Models (LLMs), RAG, Graph RAG, MCP, A2A, multi-modal AI, Gen AI Patterns, Evaluation Frameworks, Guardrails, data curation, and AWS cloud deployments. You will serve as a hands-on Gen AI (data) scientist and critical thought leader, working alongside full stack developers, UX designers, product managers and data engineers to shape and implement enterprise-grade Gen AI solutions.
Responsibilities:
Architect and implement scalable AI Agents, Agentic Workflows and GenAI applications to address diverse and complex business use cases.
Develop, fine-tune, and optimize lightweight LLMs; lead the evaluation and adaptation of models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives.
Design and deploy Retrieval-Augmented Generation (RAG) and Graph RAG systems using vector databases and knowledge bases.
Curate enterprise data using connectors integrated with AWS Bedrock's Knowledge Base/Elastic.
Implement solutions leveraging MCP (Model Context Protocol) and A2A (Agent-to-Agent) communication.
Build and maintain Jupyter-based notebooks using platforms like AWS SageMaker and MLFlow/Kubeflow on Kubernetes (EKS).
Collaborate with cross-functional teams of UI and microservice engineers, designers, and data engineers to build full-stack Gen AI experiences.
Integrate GenAI solutions with enterprise platforms via API-based methods and GenAI standardized patterns.
Establish and enforce validation procedures with Evaluation Frameworks, bias mitigation, safety protocols, and guardrails for production-ready deployment.
Design and build robust ingestion pipelines that extract, chunk, enrich, and anonymize data from PDFs, video, and audio sources for use in LLM-powered workflows, leveraging best practices like semantic chunking and privacy controls.
Orchestrate multimodal pipelines using scalable frameworks (e.g., Apache Spark, PySpark) for automated ETL/ELT workflows appropriate for unstructured media.
Implement embeddings: map media content to vector representations using embedding models, and integrate with vector stores (AWS Knowledge Base/Elastic/Mongo Atlas) to support RAG architectures.
Qualifications:
Experience in AI/ML, with applied GenAI or LLM-based solutions.
Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases (e.g., AWS Knowledge Base / Elastic), and multi-modal models.
Proven experience with cloud-native AI development (AWS SageMaker, Amazon Bedrock, MLFlow on EKS).
Strong programming skills in Python and ML libraries (Transformers, LangChain, etc.).
Deep understanding of Gen AI system patterns and architectural best practices, Evaluation Frameworks.
Demonstrated ability to work in cross-functional agile teams.
Senior Data Governance Consultant (Informatica)
Plano, TX jobs
About Paradigm - Intelligence Amplified
Paradigm is a strategic consulting firm that turns vision into tangible results. For over 30 years, we've helped Fortune 500 and high-growth organizations accelerate business outcomes across data, cloud, and AI. From strategy through execution, we empower clients to make smarter decisions, move faster, and maximize return on their technology investments. What sets us apart isn't just what we do, it's how we do it. Driven by a clear mission and values rooted in integrity, excellence, and collaboration, we deliver work that creates lasting impact. At Paradigm, your ideas are heard, your growth is prioritized, your contributions make a difference.
Summary:
We are seeking a Senior Data Governance Consultant to lead and enhance data governance capabilities across a financial services organization
The Senior Data Governance Consultant will collaborate closely with business, risk, compliance, technology, and data management teams to define data standards, strengthen data controls, and drive a culture of data accountability and stewardship
The ideal candidate will have deep experience in developing and implementing data governance frameworks, data policies, and control mechanisms that ensure compliance, consistency, and trust in enterprise data assets
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
This position is Remote, with occasional travel to Plano, TX
Responsibilities:
Data Governance Frameworks:
Design, implement, and enhance data governance frameworks aligned with regulatory expectations (e.g., BCBS 239, GDPR, CCPA, DORA) and internal control standards
Policy & Standards Development:
Develop, maintain, and operationalize data policies, standards, and procedures that govern data quality, metadata management, data lineage, and data ownership
Control Design & Implementation:
Define and embed data control frameworks across data lifecycle processes to ensure data integrity, accuracy, completeness, and timeliness
Risk & Compliance Alignment:
Work with risk and compliance teams to identify data-related risks and ensure appropriate mitigation and monitoring controls are in place
Stakeholder Engagement:
Partner with data owners, stewards, and business leaders to promote governance practices and drive adoption of governance tools and processes
Data Quality Management:
Define and monitor data quality metrics and KPIs, establishing escalation and remediation procedures for data quality issues
Metadata & Lineage:
Support metadata and data lineage initiatives to increase transparency and enable traceability across systems and processes
Reporting & Governance Committees:
Prepare materials and reporting for data governance forums, risk committees, and senior management updates
Change Management & Training:
Develop communication and training materials to embed governance culture and ensure consistent understanding across the organization
Required Qualifications:
7+ years of experience in data governance, data management, or data risk roles within financial services (banking, insurance, or asset management preferred)
Strong knowledge of data policy development, data standards, and control frameworks
Proven experience aligning data governance initiatives with regulatory and compliance requirements
Familiarity with Informatica data governance and metadata tools
Excellent communication skills with the ability to influence senior stakeholders and translate technical concepts into business language
Deep understanding of data management principles (DAMA-DMBOK, DCAM, or equivalent frameworks)
Bachelor's or Master's Degree in Information Management, Data Science, Computer Science, Business, or related field
Preferred Qualifications:
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
Experience with data risk management or data control testing
Knowledge of financial regulatory frameworks (e.g., Basel, MiFID II, Solvency II, BCBS 239)
Certifications, such as Informatica, CDMP, or DCAM
Background in consulting or large-scale data transformation programs
Key Competencies:
Strategic and analytical thinking
Strong governance and control mindset
Excellent stakeholder and relationship management
Ability to drive organizational change and embed governance culture
Attention to detail with a pragmatic approach
Why Join Paradigm
At Paradigm, integrity drives innovation. You'll collaborate with curious, dedicated teammates, solving complex problems and unlocking immense data value for leading organizations. If you seek a place where your voice is heard, growth is supported, and your work creates lasting business value, you belong at Paradigm.
Learn more at ********************
Policy Disclosure:
Paradigm maintains a strict drug-free workplace policy. All offers of employment are contingent upon successfully passing a standard 5-panel drug screen. Please note that a positive test result for any prohibited substance, including marijuana, will result in disqualification from employment, regardless of state laws permitting its use. This policy applies consistently across all positions and locations.
Snowflake Data Engineer (DBT SQL)
San Jose, CA jobs
Job Description - Snowflake Data Engineer (DBT SQL)
Duration: 6 months
Key Responsibilities
• Design, develop, and optimize data pipelines using Snowflake and DBT SQL.
• Implement and manage data warehousing concepts, metadata management, and data modeling.
• Work with data lakes, multi-dimensional models, and data dictionaries.
• Utilize Snowflake features such as Time Travel and Zero-Copy Cloning.
• Perform query performance tuning and cost optimization in cloud environments.
• Administer Snowflake architecture, warehousing, and processing.
• Develop and maintain PL/SQL Snowflake solutions.
• Apply design patterns for scalable and maintainable data solutions.
• Collaborate with cross-functional teams and tech leads across multiple tracks.
• Provide technical and functional guidance to team members.
Required Skills & Experience
• Hands-on Snowflake development experience (mandatory).
• Strong proficiency in SQL and DBT SQL.
• Knowledge of data warehousing concepts, metadata management, and data modeling.
• Experience with data lakes, multi-dimensional models, and data dictionaries.
• Expertise in Snowflake features (Time Travel, Zero-Copy Cloning).
• Strong background in query optimization and cost management.
• Familiarity with Snowflake administration and pipeline development.
• Knowledge of PL/SQL and SQL databases (additional plus).
• Excellent communication, leadership, and organizational skills.
• Strong team player with a positive attitude.
Senior Data Engineer
Philadelphia, PA jobs
We are seeking a passionate and skilled Senior Data Engineer to join our dynamic team in Philadelphia, PA. In this role, you will lead the design and implementation of advanced data pipelines for Business Intelligence (BI) and reporting. Your expertise will transform complex data into actionable insights, driving significant business value for our clients.
Key Responsibilities:
Design and implement scalable and efficient data pipelines for BI and reporting.
Define and manage key business metrics, build automated dashboards, and develop analytic self-service capabilities.
Write comprehensive technical documentation to outline data solutions and architectures.
Lead requirements gathering, solution design, and implementation for data projects.
Develop and maintain ETL frameworks for large real-world data (RWD) assets.
Mentor and guide technical teams, fostering a culture of innovation.
Stay updated with new technologies and solve complex data problems.
Facilitate the deployment and integration of AI models, ensuring data quality and compatibility with existing analytics infrastructure.
Collaborate with cross-functional stakeholders to understand data needs and deliver impactful analytics and reports.
Required Qualifications:
Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
4+ years of SQL experience.
Experience with data modeling, warehousing, and building ETL pipelines.
Proficiency in at least one modern scripting or programming language (e.g., Python, Java, Scala, NodeJS).
Experience working directly with business stakeholders to align data solutions with business needs.
Working knowledge of Snowflake as a data warehousing solution.
Experience with workflow orchestration tools like Apache Airflow.
Knowledge of data transformation tools and frameworks such as dbt (Data Build Tool), PySpark, or Snowpark.
Experience with open-source table formats (e.g., Apache Iceberg, Delta, Hudi).
Familiarity with container technologies like Docker and Kubernetes.
Experience with on-premises and cloud MDM deployments.
Preferred Qualifications:
Proficiency with data visualization tools (e.g., Tableau, Power BI, Quicksight).
Certifications in Snowflake or Azure Data Engineering
Experience with Agile methodologies and project management tools (e.g., Jira).
Experience deploying and managing data solutions within Azure AI, Azure ML, or similar environments.
Familiarity with DevOps practices, particularly CI/CD for data solutions.
Knowledge of emerging data architectures, including Data Mesh, Data Fabric, Multimodal Data Management, and AI/ML integration.
Familiarity with ETL tools like Informatica and Matillion.
Previous experience in professional services or consultancy environments.
Experience in technical pre-sales, solution demos, and proposal development.
Sr Data Platform Engineer
Elk Grove, CA jobs
Hybrid role: 3 days a week in office in Elk Grove, CA; no fully remote option
This is a direct hire opportunity.
We're seeking a seasoned Senior Data Platform Engineer to design, build, and optimize scalable data solutions that power analytics, reporting, and AI/ML initiatives. This full‑time role is hands‑on, working with architects, analysts, and business stakeholders to ensure data systems are reliable, secure, and high‑performing.
Responsibilities:
Build and maintain robust data pipelines (structured, semi‑structured, unstructured).
Implement ETL workflows with Spark, Delta Lake, and cloud‑native tools.
Support big data platforms (Databricks, Snowflake, GCP) in production.
Troubleshoot and optimize SQL queries, Spark jobs, and workloads.
Ensure governance, security, and compliance across data systems.
Integrate workflows into CI/CD pipelines with Git, Jenkins, Terraform.
Collaborate cross‑functionally to translate business needs into technical solutions.
Qualifications:
7+ years in data engineering with production pipeline experience.
Expertise in Spark ecosystem, Databricks, Snowflake, GCP.
Strong skills in PySpark, Python, SQL.
Experience with RAG systems, semantic search, and LLM integration.
Familiarity with Kafka, Pub/Sub, vector databases.
Proven ability to optimize ETL jobs and troubleshoot production issues.
Agile team experience and excellent communication skills.
Certifications in Databricks, Snowflake, GCP, or Azure.
Exposure to Airflow, BI tools (Power BI, Looker Studio).
Data Engineer (AWS Redshift, BI, Python, ETL)
Manhattan Beach, CA jobs
We are seeking a skilled Data Engineer with strong experience in business intelligence (BI) and data warehouse development to join our team. In this role, you will design, build, and optimize data pipelines and warehouse architectures that support analytics, reporting, and data-driven decision-making. You will work closely with analysts, data scientists, and business stakeholders to ensure reliable, scalable, and high-quality data solutions.
Responsibilities:
Develop and maintain ETL/ELT pipelines for ingesting, transforming, and delivering data.
Design and enhance data warehouse models (star/snowflake schemas) and BI datasets.
Optimize data workflows for performance, scalability, and reliability.
Collaborate with BI teams to support dashboards, reporting, and analytics needs.
Ensure data quality, governance, and documentation across all solutions.
Qualifications:
Proven experience with data engineering tools (SQL, Python, ETL frameworks).
Strong understanding of BI concepts, reporting tools, and dimensional modeling.
Hands-on experience with cloud data platforms (e.g., AWS, Azure, GCP) is a plus.
Excellent problem-solving skills and ability to work in a cross-functional environment.
Cloud Data Engineer- Databricks
McLean, VA jobs
Purpose:
We are seeking a highly skilled Cloud Data Engineer with deep expertise in Databricks and modern cloud platforms such as AWS, Azure, or GCP. This role is ideal for professionals who are passionate about building next-generation data platforms, optimizing complex data workflows, and enabling advanced analytics and AI in cloud-native environments. You'll have the opportunity to work with Fortune-500 organizations in data and analytics, helping them unlock the full potential of their data through innovative, scalable solutions.
Key Result Areas and Activities:
Design and implement robust, scalable data engineering solutions.
Build and optimize data pipelines using Databricks, including serverless capabilities, Unity Catalog, and Mosaic AI.
Collaborate with analytics and AI teams to enable real-time and batch data workflows.
Support and improve cloud-native data platforms (AWS, Azure, GCP).
Ensure adherence to best practices in data modeling, warehousing, and governance.
Contribute to automation of data workflows using CI/CD, DevOps, or DataOps practices.
Implement and maintain workflow orchestration tools like Apache Airflow and dbt.
Roles & Responsibilities
Essential Skills
4+ years of experience in data engineering with a focus on scalable solutions.
Strong hands-on experience with Databricks in a cloud environment.
Proficiency in Spark and Python for data processing.
Solid understanding of data modeling, data warehousing, and architecture principles.
Experience working with at least one major cloud provider (AWS, Azure, or GCP).
Familiarity with CI/CD pipelines and data workflow automation.
Desirable Skills
Direct experience with Unity Catalog and Mosaic AI within Databricks.
Working knowledge of DevOps/DataOps principles in a data engineering context.
Exposure to Apache Airflow, dbt, and modern data orchestration frameworks.
Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
Relevant certifications in cloud platforms (AWS/Azure/GCP) or Databricks are a plus.
Qualities:
Able to consult, write, and present persuasively
Able to work in a self-organized and cross-functional team
Able to iterate based on new information, peer reviews, and feedback
Able to work seamlessly with clients across multiple geographies
Research focused mindset
Excellent analytical, presentation, reporting, documentation and interactive skills
"Infocepts is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law."
Senior Data Engineer
Houston, TX jobs
We are seeking an experienced Data Engineer (5+ years) to join our Big Data & Advanced Analytics team. This role partners closely with Data Science and key business units to solve real-world midstream oil and gas challenges using machine learning, data engineering, and advanced analytics. The ideal candidate brings strong technical expertise and thought leadership to help mature and scale the organization's data engineering practice.
Must-Have Skills
Python (Pandas, NumPy, PyTest, Scikit-Learn)
SQL
Apache Airflow
Kubernetes
CI/CD
Git
Test-Driven Development (TDD)
API development
Working knowledge of Machine Learning concepts
Key Responsibilities
Build, test, and maintain scalable data pipeline architectures
Work independently on analytics and data engineering projects across multiple business functions
Automate manual data flows to improve reliability, speed, and reusability
Develop data-intensive applications and APIs
Design and implement algorithms that convert raw data into actionable insights
Deploy and operationalize mathematical and machine learning models
Support data analysts and data scientists by enabling data processing automation and deployment workflows
Implement and maintain data quality checks to ensure accuracy, completeness, and consistency
Azure Data Engineer Sr
Irving, TX jobs
Minimum 7 years of relevant work experience in data engineering, with at least 2 years in data modeling.
Strong technical foundation in Python and SQL, and experience with cloud platforms (Azure).
Deep understanding of data engineering fundamentals, including database architecture and design, extract, transform, and load (ETL) processes, data lakes, data warehousing, and both batch and streaming technologies.
Experience with data orchestration tools (e.g., Airflow), data processing frameworks (e.g., Spark, Databricks), and data visualization tools (e.g., Tableau, Power BI).
Proven ability to lead a team of engineers, fostering a collaborative and high-performing environment.
Sr. Cloud Data Engineer
Malvern, PA jobs
Job Title: Sr. Cloud Data Engineer
Duration: 12 months+ Contract
Contract Description:
Responsibilities:
Maintain and optimize AWS-based data pipelines to ensure timely and reliable data delivery.
Develop and troubleshoot workflows using AWS Glue, PySpark, Step Functions, and DynamoDB.
Collaborate on code management and CI/CD processes using Bitbucket, GitHub, and Bamboo.
Participate in code reviews and repository management to uphold coding standards.
Provide technical guidance and mentorship to junior engineers and assist in team coordination.
Qualifications:
9-10 years of experience in data engineering with strong hands-on AWS expertise.
Proficient in AWS Glue, PySpark, Step Functions, and DynamoDB.
Skilled in managing code repositories and CI/CD pipelines (Bitbucket, GitHub, Bamboo).
Experience in team coordination or mentoring roles.
Familiarity with Wealth Asset Management, especially personal portfolio performance, is a plus
Senior Snowflake Data Engineer
Santa Clara, CA jobs
About the job
Why Zensar?
We're a bunch of hardworking, fun-loving, people-oriented technology enthusiasts. We love what we do, and we're passionate about helping our clients thrive in an increasingly complex digital world. Zensar is an organization focused on building relationships, with our clients and with each other, and happiness is at the core of everything we do. In fact, we're so into happiness that we've created a Global Happiness Council, and we send out a Happiness Survey to our employees each year. We've learned that employee happiness requires more than a competitive paycheck, and our employee value proposition of grow, own, achieve, learn (GOAL) lays out the core opportunities we seek to foster for every employee. Teamwork and collaboration are critical to Zensar's mission and success, and our teams work on a diverse and challenging mix of technologies across a broad industry spectrum. These industries include banking and financial services, high-tech and manufacturing, healthcare, insurance, retail, and consumer services. Our employees enjoy flexible work arrangements and a competitive benefits package, including medical, dental, vision, 401(k), among other benefits. If you are looking for a place to have an immediate impact, to grow and contribute, where we work hard, play hard, and support each other, consider joining team Zensar!
Zensar is seeking a Senior Snowflake Data Engineer in Santa Clara, CA. Work from office all 5 days. This position is open as a full-time role with excellent benefits and growth opportunities, and as a contract role as well.
Job Description:
Key Requirements:
Strong hands-on experience in data engineering using Snowflake and Databricks, with proven ability to build and optimize large-scale data pipelines.
Deep understanding of data architecture principles, including ingestion, transformation, storage, and access control.
Solid experience in system design and solution architecture, focusing on scalability, reliability, and maintainability.
Expertise in ETL/ELT pipeline design, including data extraction, transformation, validation, and load processes.
In-depth knowledge of data modeling techniques (dimensional modeling, star, and snowflake schemas).
Skilled in optimizing compute and storage costs across Snowflake and Databricks environments.
Strong proficiency in administration, including database design, schema management, user roles, permissions, and access control policies.
Hands-on experience implementing data lineage, quality, and monitoring frameworks.
Advanced proficiency in SQL for data processing, transformation, and automation.
Experience with reporting and visualization tools such as Power BI and Sigma Computing.
Excellent communication and collaboration skills, with the ability to work independently and drive technical initiatives.
Zensar believes that diversity of backgrounds, thought, experience, and expertise fosters the robust exchange of ideas that enables the highest quality collaboration and work product. Zensar is an equal opportunity employer. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law. Zensar is committed to providing veteran employment opportunities to our service men and women. Zensar is committed to providing equal employment opportunities for persons with disabilities or religious observances, including reasonable accommodation when needed. Accommodations made to facilitate the recruiting process are not a guarantee of future or continued accommodations once hired.
Zensar does not facilitate/sponsor any work authorization for this position.
Candidates who are currently employed by a client or vendor of Zensar may be ineligible for consideration.
Zensar values your privacy. We'll use your data in accordance with our privacy statement located at: *********************************
Data Engineer (Python, PySpark, Databricks)
Dallas, TX jobs
Job Title: Data Engineer (Python, PySpark, Databricks)
We are seeking a Data Engineer with strong proficiency in SQL, Python, and PySpark to support high-performance data pipelines and analytics initiatives. This role will focus on scalable data processing, transformation, and integration efforts that enable business insights, regulatory compliance, and operational efficiency.
Data Engineer - SQL, Python, and PySpark Expert (Onsite - Dallas, TX)
Key Responsibilities
Design, develop, and optimize ETL/ELT pipelines using SQL, Python, and PySpark for large-scale data environments
Implement scalable data processing workflows in distributed data platforms (e.g., Hadoop, Databricks, or Spark environments)
Partner with business stakeholders to understand and model mortgage lifecycle data (origination, underwriting, servicing, foreclosure, etc.)
Create and maintain data marts, views, and reusable data components to support downstream reporting and analytics
Ensure data quality, consistency, security, and lineage across all stages of data processing
Assist in data migration and modernization efforts to cloud-based data warehouses (e.g., Snowflake, Azure Synapse, GCP BigQuery)
Document data flows, logic, and transformation rules
Troubleshoot performance and quality issues in batch and real-time pipelines
Support compliance-related reporting (e.g., HMDA, CFPB)
Required Qualifications
6+ years of experience in data engineering or data development
Advanced expertise in SQL (joins, CTEs, optimization, partitioning, etc.)
Strong hands-on skills in Python for scripting, data wrangling, and automation
Proficient in PySpark for building distributed data pipelines and processing large volumes of structured/unstructured data
Experience working with mortgage banking data sets and domain knowledge is highly preferred
Strong understanding of data modeling (dimensional, normalized, star schema)
Experience with cloud-based platforms (e.g., Azure Databricks, AWS EMR, GCP Dataproc)
Familiarity with ETL tools and orchestration frameworks (e.g., Airflow, ADF, dbt)
Senior Data Engineer
McLean, VA jobs
The candidate must have 5+ years of hands-on experience working with PySpark/Python, microservices architecture, AWS EKS, SQL, Postgres, DB2, Snowflake, Behave OR Cucumber frameworks, Pytest (unit testing), automation testing, and regression testing.
Experience with tools such as Jenkins, SonarQube AND/OR Fortify is preferred for this role.
Experience in Angular and DevOps is a nice-to-have for this role.
Must Have Qualifications: PySpark/Python based microservices, AWS EKS, Postgres SQL Database, Behave/Cucumber for automation, Pytest, Snowflake, Jenkins, SonarQube and Fortify.
Responsibilities:
Development of microservices based on Python, PySpark, AWS EKS, AWS Postgres for a data-oriented modernization project.
New System: Python and PySpark, AWS Postgres DB, Behave/Cucumber for automation, and Pytest
Perform System, functional and data analysis on the current system and create technical/functional requirement documents.
Current System: Informatica, SAS, AutoSys, DB2
Write automated tests using Behave/Cucumber, based on the new microservices-based architecture
Promote top code quality and solve issues related to performance tuning and scalability.
Strong skills in DevOps, Docker/container-based deployments to AWS EKS using Jenkins and experience with SonarQube and Fortify.
Able to communicate and engage with business teams and analyze the current business requirements (BRS documents) and create necessary data mappings.
Strong skills and experience in reporting application development and data analysis are preferred.
Knowledge of Agile methodologies and technical documentation.
Python Data Engineer - THADC5693417
Houston, TX jobs
Must Haves:
Strong proficiency in Python; 5+ years' experience.
Expertise in FastAPI and microservices architecture and coding
Linking Python-based apps with SQL and NoSQL databases
Deployments on Docker and Kubernetes, plus monitoring tools
Experience with automated testing and test-driven development
Git source control, GitHub Actions, CI/CD, VS Code, and Copilot
Expertise in both on-prem SQL databases (Oracle, SQL Server, Postgres, DB2) and NoSQL databases
Working knowledge of data warehousing and ETL
Able to explain the business functionality of the projects/applications they have worked on
Ability to multitask and work on multiple projects simultaneously.
NO CLOUD - they are on-prem
Day to Day:
Insight Global is looking for a Python Data Engineer for one of our largest oil and gas clients in Downtown Houston, TX. This person will be responsible for building Python-based links between applications and back-end SQL and NoSQL databases, architecting and coding FastAPI services and microservices, and testing back-office applications. The ideal candidate will have experience developing applications and implementing complex business functionality using Python and microservices.