Machine Learning Data Scientist
Pittsburgh, PA jobs
Machine Learning Data Scientist
Length: 6 Month Contract to Start
* Please no agencies. Direct employees currently authorized to work in the United States - no sponsorship available.*
Job Description:
We are looking for a Data Scientist/Engineer with Machine Learning and strong skills in Python, time-series modeling, and SCADA/industrial data. In this role, you will build and deploy ML models for forecasting, anomaly detection, and predictive maintenance using high-frequency sensor and operational data.
Essential Duties and Responsibilities:
Develop ML models for time-series forecasting and anomaly detection
Build data pipelines for SCADA/IIoT data ingestion and processing
Perform feature engineering and signal analysis on time-series data
Deploy models in production using APIs, microservices, and MLOps best practices
Collaborate with data engineers and domain experts to improve data quality and model performance
Qualifications:
Strong Python skills
Experience working with SCADA systems or industrial data historians
Solid understanding of time-series analytics and signal processing
Experience with cloud platforms and containerization (AWS/Azure/GCP, Docker)
POST-OFFER BACKGROUND CHECK IS REQUIRED. Digital Prospectors is an Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other characteristic protected by law. Digital Prospectors affirms the right of all individuals to equal opportunity and prohibits any form of discrimination or harassment.
Come see why DPC has achieved:
4.9/5 Star Glassdoor rating and the only staffing company (< 1000 employees) to be voted in the national Top 10 ‘Employee's Choice - Best Places to Work' by Glassdoor.
Voted ‘Best Staffing Firm to Temp/Contract For' seven times by Staffing Industry Analysts as well as a ‘Best Company to Work For' by Forbes, Fortune and Inc. magazine.
As you are applying, please join us in fostering diversity, equity, and inclusion by completing the Invitation to Self-Identify form today!
*******************
Job #18135
Data Scientist
Alhambra, CA jobs
Title: Principal Data Scientist
Duration: 12 Months Contract
Additional Information
California Resident Candidates Only. This position is HYBRID (2 days onsite, 2 days telework). Interviews will be conducted via Microsoft Teams. The work schedule follows a 4/40 (10-hour days, Monday-Thursday), with the specific shift determined by the program manager. Shifts may range between 7:15 a.m. and 6:00 p.m.
Job description:
The Principal Data Scientist works to establish a comprehensive Data Science Program to advance data-driven decision-making, streamline operations, and fully leverage modern platforms including Databricks, or similar, to meet increasing demand for predictive analytics and AI solutions. The Principal Data Scientist will guide program development, provide training and mentorship to junior members of the team, accelerate adoption of advanced analytics, and build internal capacity through structured mentorship. The Principal Data Scientist will possess exceptional communication abilities, both verbal and written, with a strong customer service mindset and the ability to translate complex concepts into clear, actionable insights; strong analytical and business acumen, including foundational experience with regression, association analysis, outlier detection, and core data analysis principles; working knowledge of database design and organization, with the ability to partner effectively with Data Management and Data Engineering teams; outstanding time management and organizational skills, with demonstrated success managing multiple priorities and deliverables in parallel; a highly collaborative work style, coupled with the ability to operate independently, maintain focus, and drive projects forward with minimal oversight; a meticulous approach to quality, ensuring accuracy, reliability, and consistency in all deliverables; and proven mentorship capabilities, including the ability to guide, coach, and upskill junior data scientists and analysts.
Experience Required:
Five (5)+ years of professional experience leading data science initiatives, including developing machine learning models, statistical analyses, and end-to-end data science workflows in production environments.
Three (3)+ years of experience working with Databricks and similar cloud-based analytics platforms, including notebook development, feature engineering, ML model training, and workflow orchestration.
Three (3)+ years of experience applying advanced analytics and predictive modeling (e.g., regression, classification, clustering, forecasting, natural language processing).
Two (2)+ years of experience implementing MLOps practices, such as model versioning, CI/CD for ML, MLflow, automated pipelines, and model performance monitoring.
Two (2)+ years of experience collaborating with data engineering teams to design data pipelines, optimize data transformations, and implement Lakehouse or data warehouse architectures (e.g., Databricks, Snowflake, SQL-based platforms).
Two (2)+ years of experience mentoring or supervising junior data scientists or analysts, including code reviews, training, and structured skill development.
Two (2)+ years of experience with Python and SQL programming, using data sources such as SQL Server, Oracle, PostgreSQL, or similar relational databases.
One (1)+ year of experience operationalizing analytics within enterprise governance frameworks, partnering with Data Management, Security, and IT to ensure compliance, reproducibility, and best practices.
Education Required & certifications:
This classification requires possession of a Master's degree or higher in Data Science, Statistics, Computer Science, or a closely related field. Additional qualifying professional experience may be substituted for the required education on a year-for-year basis. At least one of the following industry-recognized certifications in data science or cloud analytics, such as:
Microsoft Azure Data Scientist Associate (DP-100)
Databricks Certified Data Scientist or Machine Learning Professional
AWS Machine Learning Specialty
Google Professional Data Engineer • or equivalent advanced analytics certifications. The certification is required and may not be substituted with additional experience.
About US Tech Solutions:
US Tech Solutions is a global staff augmentation firm providing a wide range of talent on-demand and total workforce solutions. To know more about US Tech Solutions, please visit ************************
US Tech Solutions is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
Recruiter Details:
Name: T Saketh Ram Sharma
Email: *****************************
Internal Id: 25-54101
Senior Data Scientist
Plainfield, NJ jobs
Data Scientist - Pharmaceutical Analytics (PhD)
1 year Contract - Hybrid- Plainfield, NJ
We're looking for a PhD-level Data Scientist with experience in the pharmaceutical industry and expertise working with commercial data sets (IQVIA, claims, prescription data). This role will drive insights that shape drug launches, market access, and patient outcomes.
What You'll Do
Apply machine learning & advanced analytics to pharma commercial data
Deliver insights on market dynamics, physician prescribing, and patient behavior
Partner with R&D, medical affairs, and commercial teams to guide strategy
Build predictive models for sales effectiveness, adherence, and market forecasting
What We're Looking For
PhD in Data Science, Statistics, Computer Science, Bioinformatics, or related field
5+ years of pharma or healthcare analytics experience
Strong skills in enterprise-class software stacks and cloud computing
Deep knowledge of pharma market dynamics & healthcare systems
Excellent communication skills to translate data into strategy
Senior Data Engineer
Charlotte, NC jobs
**NO 3rd Party vendor candidates or sponsorship**
Role Title: Senior Data Engineer
Client: Global construction and development company
Employment Type: Contract
Duration: 1 year
Preferred Location: Remote based in ET or CT time zones
Role Description:
The Senior Data Engineer will play a pivotal role in designing, architecting, and optimizing cloud-native data integration and Lakehouse solutions on Azure, with a strong emphasis on Microsoft Fabric adoption, PySpark/Spark-based transformations, and orchestrated pipelines. This role will lead end-to-end data engineering-from ingestion through APIs and Azure services to curated Lakehouse/warehouse layers-while ensuring scalable, secure, well-governed, and well-documented data products. The ideal candidate is hands-on in delivery and also brings data architecture knowledge to help shape patterns, standards, and solution designs.
Key Responsibilities
Design and implement end-to-end data pipelines and ELT/ETL workflows using Azure Data Factory (ADF), Synapse, and Microsoft Fabric.
Build and optimize PySpark/Spark transformations for large-scale processing, applying best practices for performance tuning (partitioning, joins, file sizing, incremental loads).
Develop and maintain API-heavy ingestion patterns, including REST/SOAP integrations, authentication/authorization handling, throttling, retries, and robust error handling.
Architect scalable ingestion, transformation, and serving solutions using Azure Data Lake / OneLake, Lakehouse patterns (Bronze/Silver/Gold), and data warehouse modeling practices.
Implement monitoring, logging, alerting, and operational runbooks for production pipelines; support incident triage and root-cause analysis.
Apply governance and security practices across the lifecycle, including access controls, data quality checks, lineage, and compliance requirements.
Write complex SQL, develop data models, and enable downstream consumption through analytics tools and curated datasets.
Drive engineering standards: reusable patterns, code reviews, documentation, source control, and CI/CD practices.
Requirements:
Bachelor's degree (or equivalent experience) in Computer Science, Engineering, or a related field.
5+ years of experience in data engineering with strong focus on Azure Cloud.
Strong experience with Azure Data Factory pipelines, orchestration patterns, parameterization, and production support.
Strong hands-on experience with Synapse (pipelines, SQL pools and/or Spark), and modern cloud data platform patterns.
Advanced PySpark/Spark experience for complex transformations and performance optimization.
Heavy experience with API-based integrations (building ingestion frameworks, handling auth, pagination, retries, rate limits, and resiliency).
Strong knowledge of SQL and data warehousing concepts (dimensional modeling, incremental processing, data quality validation).
Strong understanding of cloud data architectures including Data Lake, Lakehouse, and Data Warehouse patterns.
Preferred Skills
Experience with Microsoft Fabric (Lakehouse/Warehouse/OneLake, Pipelines, Dataflows Gen2, notebooks).
Architecture experience (formal or informal), such as contributing to solution designs, reference architectures, integration standards, and platform governance.
Experience with DevOps/CI-CD for data engineering using Azure DevOps or GitHub (deployment patterns, code promotion, testing).
Experience with Power BI and semantic model considerations for Lakehouse/warehouse-backed reporting.
Familiarity with data catalog/governance tooling (e.g., Microsoft Purview).
Senior Data Scientist
Chicago, IL jobs
Role: Senior Data Scientist
· We are seeking a hands-on Senior Data Scientist to join our Insurance Analytics & AI Vertical. The ideal candidate will bring a blend of insurance domain expertise (preferably P&C), consulting mindset, and strong data science skills. This is a mid-senior level role focused on delivering value through analytics, stakeholder engagement, and logical problem solving, rather than people management.
· The role involves working closely with EXL teams and clients on reporting, data engineering, transformation, and advanced analytics projects. While strong technical skills are important, we are looking for someone who can engage directly with clients, translate business needs into analytical solutions, and drive measurable impact.
Key Responsibilities
· Collaborate with EXL and client stakeholders to design and deliver data-driven solutions across reporting, analytics, and transformation initiatives.
· Apply traditional statistical methods, machine learning, deep learning, and NLP techniques to solve business problems.
· Support insurance-focused analytics use cases (with preference for P&C lines of business).
· Work in a consulting setup: conduct requirement gathering, structure problem statements, and communicate insights effectively to senior stakeholders.
· Ensure data quality, governance, and compliance with Data Privacy and Protection Guidelines.
· Independently research, analyze, and present findings, ensuring client-ready deliverables.
· Contribute to continuous improvement initiatives and support business development activities where required.
Key Skillsets & Experience
· 7-12 years of experience in analytics, reporting, dashboarding, ETL, Python/R, and associated data management.
· Proficiency in machine learning, deep learning algorithms (e.g., neural networks), and text analytics techniques (NLTK, Gensim, LDA, word embeddings like Word2Vec, FastText, GloVe).
· Strong consulting background with structured problem-solving and stakeholder management skills.
· Excellent communication and presentation skills with the ability to influence and engage senior business leaders.
· Hands-on role with ability to independently manage client deliverables and operate in cross-cultural, global environments.
Data Management Skills
· Strong familiarity with advanced analytics tools (Python, R), BI tools (Tableau, Power BI), and related software applications.
· Good knowledge of SQL, Informatica, Hadoop/Spark, ETL tools.
· Ability to translate business/functional requirements into technical specifications.
· Exposure to cloud data management and AWS services (preferred).
Candidate Profile
· Bachelor's/Master's degree in Economics, Mathematics, Computer Science/Engineering, Operations Research, or related analytical fields.
· Prior insurance industry experience (P&C preferred) strongly desired.
· Superior analytical, logical, and problem-solving skills.
· Outstanding written and verbal communication abilities with a consultative orientation.
· Flexible to work in a fast-paced, evolving environment with occasional visits to the client's Chicago office.
Data Architect - Azure Databricks
Palo Alto, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Job Posting Title: Principal Architect - Azure Databricks
Job Description
Seeking a visionary and hands-on Principal Architect to lead large-scale, complex technical initiatives leveraging Databricks within the healthcare payer domain. This role is pivotal in driving data modernization, advanced analytics, and AI/ML solutions for our clients. You will serve as a strategic advisor, technical leader, and delivery expert across multiple engagements.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs such as sales forecasting, trade promotions, supply chain optimization etc...
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using PySpark, SQL, DLT (Delta Live Tables), and Databricks Workflows.
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for:
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for:
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
12-18 years of hands-on experience in data engineering, with at least 5+ years on Databricks Architecture and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on Azure Databricks using PySpark, SQL, and Databricks-native features.
Familiarity with ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage (Azure Data Lake Storage Gen2)
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Expertise in optimizing Databricks performance using Delta Lake features such as OPTIMIZE, VACUUM, ZORDER, and Time Travel
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Databricks SQL warehouse and integrating with BI tools (Power BI, Tableau, etc.).
Hands-on experience designing solutions using Workflows (Jobs), Delta Lake, Delta Live Tables (DLT), Unity Catalog, and MLflow.
Familiarity with Databricks REST APIs, Notebooks, and cluster configurations for automated provisioning and orchestration.
Experience in integrating Databricks with CI/CD pipelines using tools such as Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning Databricks workspaces and resources
In-depth experience with Azure Cloud services such as ADF, Synapse, ADLS, Key Vault, Azure Monitor, and Azure Security Centre.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with Unity Catalog, RBAC, tokenization, and data classification frameworks
Worked as a consultant for more than 4-5 years with multiple clients
Contribute to pre-sales, proposals, and client presentations as a subject matter expert.
Participated and Lead RFP responses for your organization. Experience in providing solutions for technical problems and provide cost estimates
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $ 200,000 - $300,000. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
San Jose, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
Santa Rosa, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
San Francisco, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
Fremont, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Data Architect
Cincinnati, OH jobs
THIS IS A W2 (NOT C2C OR REFERRAL BASED) CONTRACT OPPORTUNITY
REMOTE MOSTLY WITH 1 DAY/MO ONSITE IN CINCINNATI-LOCAL CANDIDATES TAKE PREFERENCE
RATE: $75-85/HR WITH BENEFITS
We are seeking a highly skilled Data Architect to function in a consulting capacity to analyze, redesign, and optimize a Medical Payments client's environment. The ideal candidate will have deep expertise in SQL, Azure cloud services, and modern data architecture principles.
Responsibilities
Design and maintain scalable, secure, and high-performing data architectures.
Lead migration and modernization projects in heavy use production systems.
Develop and optimize data models, schemas, and integration strategies.
Implement data governance, security, and compliance standards.
Collaborate with business stakeholders to translate requirements into technical solutions.
Ensure data quality, consistency, and accessibility across systems.
Required Qualifications
Bachelor's degree in Computer Science, Information Systems, or related field.
Proven experience as a Data Architect or similar role.
Strong proficiency in SQL (query optimization, stored procedures, indexing).
Hands-on experience with Azure cloud services for data management and analytics.
Knowledge of data modeling, ETL processes, and data warehousing concepts.
Familiarity with security best practices and compliance frameworks.
Preferred Skills
Understanding of Electronic Health Records systems.
Understanding of Big Data technologies and modern data platforms outside the scope of this project.
Data Engineer
New York, NY jobs
Hey All, We are looking for a mid-level data engineer. No third parties As a result of this expansion, we are seeking experienced software Data engineers with 5+ years of relevant experience to support the design and development of a strategic data platform for SMBC Capital Markets and Nikko Securities Group.
Qualifications and Skills
• Proven experience as a Data Engineer with experience in Azure cloud.
• Experience implementing solutions using -
• Azure cloud services
• Azure Data Factory
• Azure Lake Gen 2
• Azure Databases
• Azure Data Fabric
• API Gateway management
• Azure Functions
• Well versed with Azure Databricks
• Strong SQL skills with RDMS or no SQL databases
• Experience with developing APIs using FastAPI or similar frameworks in Python
• Familiarity with the DevOps lifecycle (git, Jenkins, etc.), CI/CD processes
• Good understanding of ETL/ELT processes
• Experience in financial services industry, financial instruments, asset classes and market data are a plus.
Data Architect
Plano, TX jobs
KPI Partners is a 5 times Gartner-recognized data, analytics, and AI consulting company. We are leaders in data engineering on Azure, AWS, Google, Snowflake, and Databricks. Founded in 2006, KPI has over 400 consultants and has successfully delivered over 1,000 projects to our clients. We are looking for skilled data engineers who want to work with the best team in data engineering.
Title: Senior Data Architect
Location: Plano, TX (Hybrid)
Job Type: Contract - 6 Months
Key Skills: SQL, PySpark, Databricks, and Azure Cloud
Key Note: Looking for a Data Architect who is Hands-on with SQL, PySpark, Databricks, and Azure Cloud.
About the Role:
We are seeking a highly skilled and experienced Senior Data Architect to join our dynamic team at KPI, working on challenging and multi-year data transformation projects. This is an excellent opportunity for a talented data engineer to play a key role in building innovative data solutions using Azure Native Services and related technologies. If you are passionate about working with large-scale data systems and enjoy solving complex engineering problems, this role is for you.
Key Responsibilities:
Data Engineering: Design, development, and implementation of data pipelines and solutions using PySpark, SQL, and related technologies.
Collaboration: Work closely with cross-functional teams to understand business requirements and translate them into robust data solutions.
Data Warehousing: Design and implement data warehousing solutions, ensuring scalability, performance, and reliability.
Continuous Learning: Stay up to date with modern technologies and trends in data engineering and apply them to improve our data platform.
Mentorship: Provide guidance and mentorship to junior data engineers, ensuring best practices in coding, design, and development.
Must-Have Skills & Qualifications:
Minimum 12+ years of overall experience in IT Industry.
4+ years of experience in data engineering, with a strong background in building large-scale data solutions.
4+ years of hands-on experience developing and implementing data pipelines using Azure stack experience (Azure, ADF, Databricks, Functions)
Proven expertise in SQL for querying, manipulating, and analyzing large datasets.
Strong knowledge of ETL processes and data warehousing fundamentals.
Self-motivated and independent, with a “let's get this done” mindset and the ability to thrive in a fast-paced and dynamic environment.
Good-to-Have Skills:
Databricks Certification is a plus.
Data Modeling, Azure Architect Certification.
Azure Data Engineer
Jersey City, NJ jobs
Title: Senior Azure Data Engineer Client: Major Japanese Bank Experience Level: Senior (10+ Years)
The Senior Azure Data Engineer will design, build, and optimize enterprise data solutions within Microsoft Azure for a major Japanese bank. This role focuses on architecting scalable data pipelines, enhancing data lake environments, and ensuring security, compliance, and data governance best practices.
Key Responsibilities:
Develop, maintain, and optimize Azure-based data pipelines and ETL/ELT workflows.
Design and implement Azure Data Lake, Synapse, Databricks, and ADF solutions.
Ensure data security, compliance, lineage, and governance controls.
Partner with architecture, data governance, and business teams to deliver high-quality data solutions.
Troubleshoot performance issues and improve system efficiency.
Required Skills:
10+ years of data engineering experience.
Strong hands-on expertise with Azure Synapse, Azure Data Factory, Azure Databricks, Azure Data Lake, and Azure SQL.
Azure certifications strongly preferred.
Strong SQL, Python, and cloud data architecture skills.
Experience in financial services or large enterprise environments preferred.
Senior Data Analytics Engineer (Customer Data)
Irving, TX jobs
Our client is seeking a Senior Data Analytics Engineer (Customer Data) to join their team! This position is located in remote.
Build, optimize, and maintain customer data pipelines in PySpark/Databricks to support CDP-driven use cases across AWS/Azure/GCP
Transform raw and integrated customer data into analytics-ready datasets used for dashboards, reporting, segmentation, personalization, and downstream AI/ML applications
Develop and enrich customer behavior metrics, campaign analytics, and performance insights such as: ad engagement, lifecycle metrics, retention
Partner with Marketing, Sales, Product, and Data Science teams to translate business goals into metrics, features, and analytical data models
Build datasets consumed by Power BI/Tableau dashboards (hands-on dashboard creation not required)
Ensure high cluster performance and pipeline optimization in Databricks, including troubleshooting skewed joins, sorting, partitioning, and real-time processing needs
Work across multiple cloud and vendor ecosystems such as: AWS/Azure/GCP; Hightouch or comparable CDP vendors
Participate in the data ingestion and digestion phases, shaping integrated data into analytical layers for MarTech and BI
Contribute to and enforce data engineering standards, documentation, governance, and best practices across the organization
Desired Skills/Experience:
6+ years of experience in Data Engineering, Analytics Engineering, or related fields, including data modeling experience
Strong Data Engineering fundamentals with the ability to design pipelines, optimize performance, and deliver real-time or near-real-time datasets
Ability to deeply understand data, identifying gaps, designing meaningful transformations, and creating metrics with clear business context
Understanding of how customer data moves through Customer Data Platforms (CDPs) and how to design pipelines that integrate with them
Experience supporting Marketing, Customer Data, MarTech, CDP, segmentation, or personalization teams strongly preferred
Hands-on experience required with: Databricks, PySpark, Python, SQL, Building analytics datasets for dashboards/reporting and customer behavior analytics or campaign performance insights
Experience designing and implementing features that feed downstream AI or customer-facing applications
Benefits:
Medical, Dental, & Vision Insurance Plans
Employee-Owned Profit Sharing (ESOP)
401K offered
The approximate pay range for this position starting at $150-160,000+. Please note that the pay range provided is a good faith estimate. Final compensation may vary based on factors including but not limited to background, knowledge, skills, and location. We comply with local wage minimums.
Senior Data Engineer
Austin, TX jobs
We are looking for a seasoned Azure Data Engineer to design, build, and optimize secure, scalable, and high-performance data solutions within the Microsoft Azure ecosystem. This will be a multi-year contract worked FULLY ONSITE in Austin, TX.
The ideal candidate brings deep technical expertise in data architecture, ETL/ELT engineering, data integration, and governance, along with hands-on experience in MDM, API Management, Lakehouse architectures, and data mesh or data hub frameworks. This position combines strategic architectural planning with practical, hands-on implementation, empowering cross-functional teams to leverage data as a key organizational asset.
Key Responsibilities
1. Data Architecture & Strategy
Design and deploy end-to-end Azure data platforms using Azure Data Lake, Azure Synapse Analytics, Azure Databricks, and Azure SQL Database.
Build and implement Lakehouse and medallion (Bronze/Silver/Gold) architectures for scalable and modular data processing.
Define and support data mesh and data hub patterns to promote domain-driven design and federated governance.
Establish standards for conceptual, logical, and physical data modeling across data warehouse and data lake environments.
2. Data Integration & Pipeline Development
Develop and maintain ETL/ELT pipelines using Azure Data Factory, Synapse Pipelines, and Databricks for both batch and streaming workloads.
Integrate diverse data sources (on-prem, cloud, SaaS, APIs) into a unified Azure data environment.
Optimize pipelines for cost-effectiveness, performance, and scalability.
3. Master Data Management (MDM) & Data Governance
Implement MDM solutions using Azure-native or third-party platforms (e.g., Profisee, Informatica, Semarchy).
Define and manage data governance, metadata, and data quality frameworks.
Partner with business teams to align data standards and maintain data integrity across domains.
4. API Management & Integration
Build and manage APIs for data access, transformation, and system integration using Azure API Management and Logic Apps.
Design secure, reliable data services for internal and external consumers.
Automate workflows and system integrations using Azure Functions, Logic Apps, and Power Automate.
5. Database & Platform Administration
Perform core DBA tasks, including performance tuning, query optimization, indexing, and backup/recovery for Azure SQL and Synapse.
Monitor and optimize cost, performance, and scalability across Azure data services.
Implement CI/CD and Infrastructure-as-Code (IaC) solutions using Azure DevOps, Terraform, or Bicep.
6. Collaboration & Leadership
Work closely with data scientists, analysts, business stakeholders, and application teams to deliver high-value data solutions.
Mentor junior engineers and define best practices for coding, data modeling, and solution design.
Contribute to enterprise-wide data strategy and roadmap development.
Required Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or related fields.
5+ years of hands-on experience in Azure-based data engineering and architecture.
Strong proficiency with the following:
Azure Data Factory, Azure Synapse, Azure Databricks, Azure Data Lake Storage Gen2
SQL, Python, PySpark, PowerShell
Azure API Management and Logic Apps
Solid understanding of data modeling approaches (3NF, dimensional modeling, Data Vault, star/snowflake schemas).
Proven experience with Lakehouse/medallion architectures and data mesh/data hub designs.
Familiarity with MDM concepts, data governance frameworks, and metadata management.
Experience with automation, data-focused CI/CD, and IaC.
Thorough understanding of Azure security, RBAC, Key Vault, and core networking principles.
What We Offer
Competitive compensation and benefits package
Luna Data Solutions, Inc. (LDS) provides equal employment opportunities to all employees. All applicants will be considered for employment. LDS prohibits discrimination and harassment of any type regarding age, race, color, religion, sexual orientation, gender identity, sex, national origin, genetics, protected veteran status, and/or disability status.
Big Data Engineer
Santa Monica, CA jobs
Our client is seeking a Big Data Engineer to join their team! This position is located in Santa Monica, California.
Design and build core components of a large-scale data platform for both real-time and batch processing, owning key features of big data applications that evolve with business needs
Develop next-generation, cloud-based big data infrastructure supporting batch and streaming workloads, with continuous improvements to performance, scalability, reliability, and availability
Champion engineering excellence, promoting best practices such as design patterns, CI/CD, thorough code reviews, and automated testing
Drive innovation, contributing new ideas and applying cutting-edge technologies to deliver impactful solutions
Participate in the full software development lifecycle, including system design, experimentation, implementation, deployment, and testing
Collaborate closely with program managers, product managers, SDETs, and researchers in an open, agile, and highly innovative environment
Desired Skills/Experience:
Bachelor's degree in a STEM field such as: Science, Technology, Engineering, Mathematics
5+ years of relevant professional experience
4+ years of professional software development experience using Java, Scala, Python, or similar programming languages
3+ years of hands-on big data development experience with technologies such as Spark, Flink, SingleStore, Kafka, NiFi, and AWS big data tools
Strong understanding of system and application design, architecture principles, and distributed system fundamentals
Proven experience building highly available, scalable, and production-grade services
Genuine passion for technology, with the ability to work across interdisciplinary areas and adopt new tools or approaches
Experience processing massive datasets at the petabyte scale
Proficiency with cloud infrastructure and DevOps tools, such as Terraform, Kubernetes (K8s), Spinnaker, IAM, and ALB
Hands-on experience with modern data warehousing and analytics platforms, including ClickHouse, Druid, Snowflake, Impala, Presto, Kinesis, and more
Familiarity with common web development frameworks, such as Spring Boot, React.js, Vue.js, or Angular
Benefits:
Medical, Dental, & Vision Insurance Plans
Employee-Owned Profit Sharing (ESOP)
401K offered
The approximate pay range for this position is between $52.00 and $75.00. Please note that the pay range provided is a good faith estimate. Final compensation may vary based on factors including but not limited to background, knowledge, skills, and location. We comply with local wage minimums.
Senior Data Engineer
Glendale, CA jobs
Our client is seeking a Senior Data Engineer to join their team! This position is located in Glendale, California.
Contribute to maintaining, updating, and expanding existing Core Data platform data pipelines
Build tools and services to support data discovery, lineage, governance, and privacy
Collaborate with other software and data engineers and cross-functional teams
Work with a tech stack that includes Airflow, Spark, Databricks, Delta Lake, Kubernetes, and AWS
Collaborate with product managers, architects, and other engineers to drive the success of the Core Data platform
Contribute to developing and documenting internal and external standards and best practices for pipeline configurations, naming conventions, and more
Ensure high operational efficiency and quality of Core Data platform datasets to meet SLAs and ensure reliability and accuracy for stakeholders in Engineering, Data Science, Operations, and Analytics
Participate in agile and scrum ceremonies to collaborate and refine team processes
Engage with customers to build relationships, understand needs, and prioritize both innovative solutions and incremental platform improvements
Maintain detailed documentation of work and changes to support data quality and data governance requirements
Desired Skills/Experience:
5+ years of data engineering experience developing large data pipelines
Proficiency in at least one major programming language such as: Python, Java or Scala
Strong SQL skills and the ability to create queries to analyze complex datasets
Hands-on production experience with distributed processing systems such as Spark
Experience interacting with and ingesting data efficiently from API data sources
Experience coding with the Spark DataFrame API to create data engineering workflows in Databricks
Hands-on production experience with data pipeline orchestration systems such as Airflow for creating and maintaining data pipelines
Experience developing APIs with GraphQL
Deep understanding of AWS or other cloud providers, as well as infrastructure-as-code
Familiarity with data modeling techniques and data warehousing best practices
Strong algorithmic problem-solving skills
Excellent written and verbal communication skills
Advanced understanding of OLTP versus OLAP environments
Benefits:
Medical, Dental, & Vision Insurance Plans
Employee-Owned Profit Sharing (ESOP)
401K offered
The approximate pay range for this position is between $51.00 and $73.00. Please note that the pay range provided is a good faith estimate. Final compensation may vary based on factors including but not limited to background, knowledge, skills, and location. We comply with local wage minimums.
Azure Data Engineer
Irving, TX jobs
Our client is seeking an Azure Data Engineer to join their team! This position is located in Irving, Texas. THIS ROLE REQUIRES AN ONSITE INTERVIEW IN IRVING, please only apply if you are local and available to interview onsite.
Duties:
Lead the design, architecture, and implementation of key data initiatives and platform capabilities
Optimize existing data workflows and systems to improve performance, cost-efficiency, identifying and guiding teams to implement solutions
Lead and mentor a team of 2-5 data engineers, providing guidance on technical best practices, career development, and initiative execution
Contribute to the development of data engineering standards, processes, and documentation, promoting consistency and maintainability across teams while enabling business stakeholders
Desired Skills/Experience:
Bachelor's degree or equivalent in Computer Science, Mathematics, Software Engineering, Management Information Systems, etc.
5+ years of relevant work experience in data engineering
Strong technical skills in SQL, PySpark/Python, Azure, and Databricks
Deep understanding of data engineering fundamentals, including database architecture and design, ETL, etc.
Benefits:
Medical, Dental, & Vision Insurance Plans
Employee-Owned Profit Sharing (ESOP)
401K offered
The approximate pay range for this position starting at $140-145,000+. Please note that the pay range provided is a good faith estimate. Final compensation may vary based on factors including but not limited to background, knowledge, skills, and location. We comply with local wage minimums.
Data Engineer
Austin, TX jobs
About the Role
We are seeking a highly skilled Databricks Data Engineer with strong expertise in modern data engineering, Azure cloud technologies, and Lakehouse architectures. This role is ideal for someone who thrives in dynamic environments, enjoys solving complex data challenges, and can lead end-to-end delivery of scalable data solutions.
What We're Looking For
8+ years designing and delivering scalable data pipelines in modern data platforms
Deep experience in data engineering, data warehousing, and enterprise-grade solution delivery
Ability to lead cross-functional initiatives in matrixed teams
Advanced skills in SQL, Python, and ETL/ELT development, including performance tuning
Hands-on experience with Azure, Snowflake, and Databricks, including system integrations
Key Responsibilities
Design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform
Modernize and enhance cloud-based data ecosystems on Azure, contributing to architecture, modeling, security, and CI/CD
Use Apache Airflow and similar tools for workflow automation and orchestration
Work with financial or regulated datasets while ensuring strong compliance and governance
Drive best practices in data quality, lineage, cataloging, and metadata management
Primary Technical Skills
Develop and optimize ETL/ELT pipelines using Python, PySpark, Spark SQL, and Databricks Notebooks
Design efficient Delta Lake models for reliability and performance
Implement and manage Unity Catalog for governance, RBAC, lineage, and secure data sharing
Build reusable frameworks using Databricks Workflows, Repos, and Delta Live Tables
Create scalable ingestion pipelines for APIs, databases, files, streaming sources, and MDM systems
Automate ingestion and workflows using Python and REST APIs
Support downstream analytics for BI, data science, and application workloads
Write optimized SQL/T-SQL queries, stored procedures, and curated datasets
Automate DevOps workflows, testing pipelines, and workspace configurations
Additional Skills
Azure: Data Factory, Data Lake, Key Vault, Logic Apps, Functions
CI/CD: Azure DevOps
Orchestration: Apache Airflow (plus)
Streaming: Delta Live Tables
MDM: Profisee (nice-to-have)
Databases: SQL Server, Cosmos DB
Soft Skills
Strong analytical and problem-solving mindset
Excellent communication and cross-team collaboration
Detail-oriented with a high sense of ownership and accountability