ETL/ELT Data Engineer (Secret Clearance) - Hybrid
Austin, TX jobs
LaunchCode is recruiting for a Software Data Engineer to work at one of our partner companies!
Details:
Full-Time W2, Salary
Immediate opening
Hybrid - Austin, TX (onsite 1-2 times a week)
Pay $85K-$120K
Minimum Experience: 4 years
Security Clearance: Active DoD Secret Clearance
Disclaimer: Please note that we are unable to provide work authorization or sponsorship for this role, now or in the future. Candidates requiring current or future sponsorship will not be considered.
Job description
Job Summary
A Washington, DC-based software solutions provider founded in 2017, specializes in delivering mission-critical and enterprise solutions to the federal government. Originating from the Department of Defense's software factory ecosystem, the company focuses on Command and Control, Cybersecurity, Space, Geospatial, and Modeling & Simulation. The company leverages commercial technology to enhance the capabilities of the DoD, IC, and their end-users, with innovation driven by its Innovation centers. The company has a presence in Boston, MA, Colorado Springs, CO, San Antonio, TX, and St. Louis, MO.
Why the company?
Environment of Autonomy
Innovative Commercial Approach
People over process
We are seeking a passionate Software Data Engineer to support the Army Software Factory (ASWF) in aligning with DoDM 8140.03 Cyber Workforce requirements and broader compliance mandates. The Army Software Factory (ASWF), a first-of-its-kind initiative under Army Futures Command, is revolutionizing the Army's approach to software development by training and employing self-sustaining technical talent from across the military and civilian workforce. Guided by the motto “By Soldiers, For Soldiers,” ASWF equips service members to develop mission-critical software solutions independently-especially vital for future contested environments where traditional technical support may be unavailable. This initiative also serves as a strategic prototype to modernize legacy IT processes and build technical readiness across the force to ensure battlefield dominance in the digital age.
Required Skills:
Active DoD Secret Clearance (Required)
4+ years of experience in data science, data engineering, or similar roles.
Expertise in designing, building, and maintaining scalable ETL/ELT pipelines using tools and languages such as Python, SQL, Apache Spark, or Airflow.
Strong proficiency in working with relational and NoSQL databases, including experience with database design, optimization, and query performance tuning (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
Demonstrable experience with cloud data platforms and services (e.g., AWS Redshift, S3, Glue, Athena; Azure Data Lake, Data Factory, Synapse; Google BigQuery, Cloud Storage, Dataflow).
Solid understanding of data warehousing concepts (e.g., Kimball, Inmon methodologies) and experience with data modeling for analytical purposes.
Proficiency in at least one programming language commonly used in data engineering (e.g., Python, Java, Scala) for data manipulation, scripting, and automation.
CompTIA Security+ Certified or otherwise DoDM 8140.03 (formerly DoD 8570.01-M) compliant.
Nice to Have:
Familiarity with SBIR technologies and transformative platform shifts
Experience working in Agile or DevSecOps environments
2+ years of experience interfacing with Platform Engineers and data visibility team, manage AWS resources, and GitLab admin
#LI-hybrid #austintx #ETLengineer #dataengineer #army #aswf #clearancejobs #clearedjobs #secretclearance #ETL
Data Architect - Azure Databricks
Palo Alto, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Job Posting Title: Principal Architect - Azure Databricks
Job Description
Seeking a visionary and hands-on Principal Architect to lead large-scale, complex technical initiatives leveraging Databricks within the healthcare payer domain. This role is pivotal in driving data modernization, advanced analytics, and AI/ML solutions for our clients. You will serve as a strategic advisor, technical leader, and delivery expert across multiple engagements.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs such as sales forecasting, trade promotions, supply chain optimization etc...
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using PySpark, SQL, DLT (Delta Live Tables), and Databricks Workflows.
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for:
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for:
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
12-18 years of hands-on experience in data engineering, with at least 5+ years on Databricks Architecture and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on Azure Databricks using PySpark, SQL, and Databricks-native features.
Familiarity with ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage (Azure Data Lake Storage Gen2)
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Expertise in optimizing Databricks performance using Delta Lake features such as OPTIMIZE, VACUUM, ZORDER, and Time Travel
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Databricks SQL warehouse and integrating with BI tools (Power BI, Tableau, etc.).
Hands-on experience designing solutions using Workflows (Jobs), Delta Lake, Delta Live Tables (DLT), Unity Catalog, and MLflow.
Familiarity with Databricks REST APIs, Notebooks, and cluster configurations for automated provisioning and orchestration.
Experience in integrating Databricks with CI/CD pipelines using tools such as Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning Databricks workspaces and resources
In-depth experience with Azure Cloud services such as ADF, Synapse, ADLS, Key Vault, Azure Monitor, and Azure Security Centre.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with Unity Catalog, RBAC, tokenization, and data classification frameworks
Worked as a consultant for more than 4-5 years with multiple clients
Contribute to pre-sales, proposals, and client presentations as a subject matter expert.
Participated and Lead RFP responses for your organization. Experience in providing solutions for technical problems and provide cost estimates
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $ 200,000 - $300,000. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Head of Data Science & AI
Austin, TX jobs
Duration: 6 month contract-to-hire
Compensation: $150K-160K
Work schedule: Monday-Friday (8 AM-5PM CST) - onsite 3x per week
Benefits: This position is eligible for medical, dental, vision and 401(k)
The Head of Data Science & AI leads the organization's data science strategy and team, driving advanced analytics and AI initiatives to deliver business value and innovation. This role sets the strategic direction for data science, ensures alignment with organizational goals, and promotes a data-driven culture. It involves close collaboration with business and technology teams to identify opportunities for leveraging machine learning and AI to improve operations and customer experiences.
Key Responsibilities
Develop and execute a data science strategy and roadmap aligned with business objectives.
Build and lead the data science team, providing mentorship and fostering growth.
Partner with business leaders to identify challenges and deliver actionable insights.
Oversee design and deployment of predictive models, algorithms, and analytical frameworks.
Ensure data integrity, governance, and security in collaboration with engineering teams.
Communicate complex insights to non-technical stakeholders.
Manage infrastructure, tools, and budget for data science initiatives.
Drive experimentation with emerging AI technologies and ensure ethical AI practices.
Oversee full AI model lifecycle: development, deployment, monitoring, and compliance.
Qualifications
8+ years in data science/analytics with leadership experience.
Expertise in Python, R, SQL, and ML frameworks (TensorFlow, PyTorch, Scikit-Learn).
Experience deploying ML models and monitoring performance.
Familiarity with visualization tools (Tableau, Power BI).
Strong knowledge of data governance, advanced statistical methods, and AI trends.
Skills in project management tools (MS Project, JIRA) and software development best practices (CI/CD, Git, Agile).
Please apply directly to be considered.
AWS Data Architect
San Jose, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
Santa Rosa, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
San Francisco, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
Sunnyvale, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Senior Data Engineer
Los Angeles, CA jobs
Robert Half is partnering with a well known brand seeking an experienced Data Engineer with Databricks experience. Working alongside data scientists and software developers, you'll work will directly impact dynamic pricing strategies by ensuring the availability, accuracy, and scalability of data systems. This position is full time with full benefits and 3 days onsite in the Woodland Hills, CA area.
Responsibilities:
Design, build, and maintain scalable data pipelines for dynamic pricing models.
Collaborate with data scientists to prepare data for model training, validation, and deployment.
Develop and optimize ETL processes to ensure data quality and reliability.
Monitor and troubleshoot data workflows for continuous integration and performance.
Partner with software engineers to embed data solutions into product architecture.
Ensure compliance with data governance, privacy, and security standards.
Translate stakeholder requirements into technical specifications.
Document processes and contribute to data engineering best practices.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
4+ years of experience in data engineering, data warehousing, and big data technologies.
Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, SQL Server).
Must have experience in Databricks.
Experience working within Azure or AWS or GCP environment.
Familiarity with big data tools like Spark, Hadoop, or Databricks.
Experience in real-time data pipeline tools.
Experienced with Python
AWS Data Architect
Santa Clara, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
AWS Data Architect
Fremont, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner.
Please visit Fractal | Intelligence for Imagination for more information about Fractal.
Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines.
Responsibilities:
Design & Architecture of Scalable Data Platforms
Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs
Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management).
Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility.
Client & Business Stakeholder Engagement
Partner with business stakeholders to translate functional requirements into scalable technical solutions.
Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases
Data Pipeline Development & Collaboration
Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL
Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets.
Performance, Scalability, and Reliability
Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques.
Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools
Security, Compliance & Governance
Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging.
Adoption of AI Copilots & Agentic Development
Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for
Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks.
Generating documentation and test cases to accelerate pipeline development.
Interactive debugging and iterative code optimization within notebooks.
Advocate for agentic AI workflows that use specialized agents for
Data profiling and schema inference.
Automated testing and validation.
Innovation and Continuous Learning
Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.
Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements.
Requirements:
Bachelor's or master's degree in computer science, Information Technology, or a related field.
8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark.
Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL.
Excellent hands on experience with workload automation tools such as Airflow, Prefect etc.
Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage
Experience designing Lakehouse architectures with bronze, silver, gold layering.
Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.).
Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions.
Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources
In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc.
Strong understanding of data privacy, access controls, and governance best practices.
Experience working with RBAC, tokenization, and data classification frameworks
Excellent communication skills for stakeholder interaction, solution presentations, and team coordination.
Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements.
Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality
Must be able to work in PST time zone.
Pay:
The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period.
Benefits:
As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.
Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
Data Engineer
Culver City, CA jobs
Robert Half is partnering with a well known high tech company seeking an experienced Data Engineer with strong Python and SQL skills. The primary duties involve managing the complete data lifecycle and utilizing extensive datasets across marketing, software, and web platforms. This position is full time with full benefits and 3 days onsite in the Culver CIty area.
Responsibilities:
4+ years of professional experience ideally in a combination of data engineering and business intelligence.
Working heavily with SQL and programming in Python.
Ownership mindset to oversee the entire data lifecycle, including collection, extraction, and cleansing processes.
Building reports and data visualization to help advance business.
Leverage industry-standard tools for data integration such as Talend.
Work extensively within Cloud based ecosystems such as AWS and GCP ecosystems.
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5+ years of experience in data engineering, data warehousing, and big data technologies.
Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, SQL Server) and NoSQL Technologies.
Experience working within GCP environments and AWS.
Experience in real-time data pipeline tools.
Hands-on expertise with Google Cloud services including BigQuery.
Deep knowledge of SQL including Dimension tables and experienced in Python programming.
Data Engineer
Coppell, TX jobs
Title: Data Engineer
Assignment Type: 6-12 month contract-to-hire
Compensation: $65/hr-$75/hr W2
Work Model: Hybrid (4 days on-site, 1 day remote)
Benefits: Medical, Dental, Vision, 401(k)
What we need is someone who comes 8+ years of experience in the Data Engineering space who specializes in Microsoft Azure and Databricks. This person will be a part of multiple initiatives for the "New Development" and "Data Reporting" teams but will be primarily tasked with designing, building, maintaining, and automating their enterprise data architecture/pipelines within the cloud.
Technology-wise we are needing to come with skills in Azure Databricks (5+ years), cloud-based environment (Azure and/or AWS), Azure DevOps (ADO), SQL (ETL, SSIS packages), and PySpark or Scala automation. Architecture experience in building pipelines, data modeling, data pipeline deployment, data mapping, etc.
Top Skills:
-8+ Years of Data Engineer/Business Intelligence
-Databricks and Azure Data Factory *Most updated is Unity Catalog for Databricks*
-Cloud-based environments (Azure or AWS)
-Data Pipeline Architecture and CI/CD methodology
-SQL
-Automation (Python (PySpark), Scala)
Data Engineer
Dallas, TX jobs
We are seeking a highly experienced Senior Data Engineer with deep expertise in modern data engineering frameworks and cloud-native architectures, primarily on AWS. This role focuses on designing, building, and optimizing scalable data pipelines and distributed systems.
You will collaborate cross-functionally to deliver secure, high-quality data solutions that drive business decisions.
Key Responsibilities
Design & Build: Develop and maintain scalable, highly available AWS-based data pipelines, specializing in EKS/ECS containerized workloads and services like Glue, EMR, and Lake Formation.
Orchestration: Implement automated data ingestion, transformation, and workflow orchestration using Airflow, NiFi, and AWS Step Functions.
Real-time: Architect and implement real-time streaming solutions with Kafka, MSK, and Flink.
Data Lake & Storage: Architect secure S3 data storage and govern data lakes using Lake Formation and Glue Data Catalog.
Optimization: Optimize distributed processing solutions (Databricks, Spark, Hadoop) and troubleshoot performance across cloud-native systems.
Governance: Ensure robust data quality, security, and governance via IAM, Lake Formation controls, and automated validations.
Mentorship: Mentor junior team members and foster technical excellence.
Requirements
Experience: 7+ years in data engineering; strong hands-on experience designing cloud data pipelines.
AWS Expertise: Deep proficiency in EKS, ECS, S3, Lake Formation, Glue, EMR, IAM, and MSK.
Core Tools: Strong experience with Kafka, Airflow, NiFi, Databricks, Spark, Hadoop, and Flink.
Coding: Proficiency in Python, Scala, or Java for building data pipelines and automation.
Databases: Strong SQL skills and experience with relational/NoSQL databases (e.g., Redshift, DynamoDB).
Cloud-Native Skills: Strong knowledge of Kubernetes, containerization, and CI/CD pipelines.
Education: Bachelor's degree in Computer Science or related field.
Senior Data Engineer
Houston, TX jobs
About the Role
The Senior Data Engineer will play a critical role in building and scaling an enterprise data platform to enable analytics, reporting, and operational insights across the organization.
This position requires deep expertise in Snowflake and cloud technologies (AWS or Azure), along with strong upstream oil & gas domain experience. The engineer will design and optimize data pipelines, enforce data governance and quality standards, and collaborate with cross-functional teams to deliver reliable, scalable data solutions.
Key Responsibilities
Data Architecture & Engineering
Design, develop, and maintain scalable data pipelines using Snowflake, AWS/Azure, and modern data engineering tools.
Implement ETL/ELT processes integrating data from upstream systems (SCADA, production accounting, drilling, completions, etc.).
Architect data models supporting both operational reporting and advanced analytics.
Establish and maintain frameworks for data quality, validation, and lineage to ensure enterprise data trust.
Platform Development & Optimization
Lead the build and optimization of Snowflake-based data warehouses for performance and cost efficiency.
Design cloud-native data solutions leveraging AWS/Azure services (S3, Lambda, Azure Data Factory, Databricks).
Manage large-scale time-series and operational data processing workflows.
Implement strong security, access control, and governance practices.
Technical Leadership & Innovation
Mentor junior data engineers and provide technical leadership across the data platform team.
Research and introduce new technologies to enhance platform scalability and automation.
Build reusable frameworks, components, and utilities to streamline delivery.
Support AI/ML initiatives by delivering production-ready, high-quality data pipelines.
Business Partnership
Collaborate with stakeholders across business units to translate requirements into technical solutions.
Work with analysts and data scientists to enable self-service analytics and reporting.
Ensure data integration supports regulatory and compliance reporting.
Act as a bridge between business and technical teams to ensure alignment and impact.
Qualifications & Experience
Education
Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field.
Advanced degree or relevant certifications (SnowPro, AWS/Azure Data Engineer, Databricks) preferred.
Experience
7+ years in data engineering roles, with at least 3 years on cloud data platforms.
Proven expertise in Snowflake and at least one major cloud platform (AWS or Azure).
Hands-on experience with upstream oil & gas data (wells, completions, SCADA, production, reserves, etc.).
Demonstrated success delivering operational and analytical data pipelines.
Technical Skills
Advanced SQL and Python programming skills.
Strong background in data modeling, ETL/ELT, cataloging, lineage, and data security.
Familiarity with Airflow, Azure Data Factory, or similar orchestration tools.
Experience with CI/CD, Git, and automated testing.
Knowledge of BI tools such as Power BI, Spotfire, or Tableau.
Understanding of AI/ML data preparation and integration.
Data Architect
Orlando, FL jobs
Data Architect
Duration: 6 Months
Responsible for enterprise-wide data design, balancing optimization of data access with batch loading and resource
utilization factors. Knowledgeable in most aspects of designing and constructing data architectures, operational data
stores, and data marts. Focuses on enterprise-wide data modelling and database design. Defines data architecture
standards, policies and procedures for the organization, structure, attributes and nomenclature of data elements, and
applies accepted data content standards to technology projects. Responsible for business analysis, data acquisition and
access analysis and design, Database Management Systems optimization, recovery strategy and load strategy design and implementation.
Essential Position Functions:
Evaluate and recommend data management processes.
Design, prepare and optimize data pipelines and workflows.
Lead implementations of secure, scalable, and reliable Azure solutions.
Observe and recommend how to monitor and optimize Azure for performance and cost-efficiency.
Endorse and foster security best practices, access controls, and compliance standards for all data lake resources.
Perform knowledge transfer about troubleshooting and documenting Azure architectures and solutions.
Skills required:
Deep understanding of Azure synapse Analytics, Azure Data Factory, and related Azure data tools
Lead implementations of secure, scalable, and reliable Azure solutions.
Observe and recommend how to monitor and optimize Azure for performance and cost efficiency.
Expertise in implementing Data Vault 2.0 methodologies using Wherescape automation software.
Proficient in designing and optimizing fact and dimension table models.
Demonstrated ability to design, develop, and maintain data pipelines and workflows.
Strong skills in formulating, reviewing, and optimizing SQL code.
Expertise in data collection, storage, accessibility, and quality improvement processes.
Endorse and foster security best practices, access controls, and compliance standards for all data lake resources.
Proven track record of delivering consumable data using information marts.
Excellent communication skills to effectively liaise with technical and non-technical team members.
Ability to document designs, procedures, and troubleshooting methods clearly.
Proficiency in Python or PowerShell preferred.
Bachelor's or master's degree in computer science, Information Systems, or other related field. Or equivalent work experience.
A minimum of 7 years of experience with large and complex database management systems.
Data Engineer
Houston, TX jobs
Python Data Engineer - Houston, TX (Onsite Only)
A global energy and commodities organization is seeking an experienced Python Data Engineer to expand and optimize data assets that support high-impact analytics. This role works closely with traders, analysts, researchers, and data scientists to translate business needs into scalable technical solutions. The position is fully onsite due to the collaborative, fast-paced nature of the work.
MUST come from an Oil & Gas organization, prefer commodity trading firm.
CANNOT do C2C.
Key Responsibilities
Build modular, reusable Python components to connect external data sources with internal tools and databases.
Partner with business stakeholders to define data ingestion and access requirements.
Translate business requirements into well-designed technical deliverables.
Maintain and enhance the central Python codebase following established standards.
Contribute to internal developer tools and ETL frameworks, helping standardize and consolidate core functionality.
Collaborate with global engineering teams and participate in internal Python community initiatives.
Qualifications
7+ years of professional Python development experience.
Strong background in data engineering and pipeline development.
Experience with web scraping tools (Requests, BeautifulSoup, Selenium).
Hands-on Oracle/PL SQL development, including stored procedures.
Strong grasp of object-oriented design, design patterns, and service-oriented architectures.
Experience with Agile/Scrum, code reviews, version control, and issue tracking.
Familiarity with scientific computing libraries (Pandas, NumPy).
Excellent communication skills.
Industry experience in energy or commodities preferred.
Exposure to containerization (Docker, Kubernetes) is a plus.
Data Architect
Dallas, TX jobs
Primary responsibilities of the Senior Data Architect include designing and managing Data Architectural solutions for multiple environments including but not limited to Data Warehouse, ODS, Data Replication/ETL Data Management initiatives. The candidate will be in an expert role and will work closely with Business, DBA, ETL and Data Management teams providing solutions for complex Data related initiatives. This individual will also be responsible for developing and managing Data Governance and Master Data Management solutions. This candidate must have good technical and communication skills coupled with the ability to mentor effectively.
Responsibilities
Establishing policies, procedures and guidelines regarding all aspects of Data Governance
Ensure data decisions are consistent, and best practices are adhered to
Ensure Data Standardization definitions, Data Dictionary and Data Lineage are kept up to date and accessible
Work with ETL, Replication and DBA teams to determine best practices as it relates to data transformations, data movement and derivations
Work with support teams to ensure consistent and pro-active support methodologies are in place for all aspects of data movements and data transformations
Work with and mentor Data Architects and Data Analysts to ensure best practices are adhered to for database design and data management
Assist in overall Architectural solutions including, but not limited to Data Warehouse, ODS, Data Replication/ETL Data Management initiatives
Work with the business teams and Enterprise Architecture team to ensure best architectural solutions from a Data perspective
Create a strategic roadmap for MDM implementation
Responsible for implementing a Master Data Management tool
Establishing policies, procedures and guidelines regarding all aspects of Master Data Management
Ensure Architectural rules and design of the MDM process are documented and best practices are adhered to
Qualifications
5+ years of Data Architecture experience, including OLTP, Data Warehouse, Big Data
5+ years of Solution Architecture experience
5+ years of MDM experience
5+ years of Data Governance experience, working knowledge of best practices
Extensive working knowledge of all aspects of Data Movement and Processing, including Middleware, ETL, API, OLAP and best practices for data tracking
Good Communication skills
Self-Motivated
Capable of presenting to all levels of audiences
Works well in a team environment
Data Architect
Plano, TX jobs
KPI Partners is a 5 times Gartner-recognized data, analytics, and AI consulting company. We are leaders in data engineering on Azure, AWS, Google, Snowflake, and Databricks. Founded in 2006, KPI has over 400 consultants and has successfully delivered over 1,000 projects to our clients. We are looking for skilled data engineers who want to work with the best team in data engineering.
Title: Senior Data Architect
Location: Plano, TX (Hybrid)
Job Type: Contract - 6 Months
Key Skills: SQL, PySpark, Databricks, and Azure Cloud
Key Note: Looking for a Data Architect who is Hands-on with SQL, PySpark, Databricks, and Azure Cloud.
About the Role:
We are seeking a highly skilled and experienced Senior Data Architect to join our dynamic team at KPI, working on challenging and multi-year data transformation projects. This is an excellent opportunity for a talented data engineer to play a key role in building innovative data solutions using Azure Native Services and related technologies. If you are passionate about working with large-scale data systems and enjoy solving complex engineering problems, this role is for you.
Key Responsibilities:
Data Engineering: Design, development, and implementation of data pipelines and solutions using PySpark, SQL, and related technologies.
Collaboration: Work closely with cross-functional teams to understand business requirements and translate them into robust data solutions.
Data Warehousing: Design and implement data warehousing solutions, ensuring scalability, performance, and reliability.
Continuous Learning: Stay up to date with modern technologies and trends in data engineering and apply them to improve our data platform.
Mentorship: Provide guidance and mentorship to junior data engineers, ensuring best practices in coding, design, and development.
Must-Have Skills & Qualifications:
Minimum 12+ years of overall experience in IT Industry.
4+ years of experience in data engineering, with a strong background in building large-scale data solutions.
4+ years of hands-on experience developing and implementing data pipelines using Azure stack experience (Azure, ADF, Databricks, Functions)
Proven expertise in SQL for querying, manipulating, and analyzing large datasets.
Strong knowledge of ETL processes and data warehousing fundamentals.
Self-motivated and independent, with a “let's get this done” mindset and the ability to thrive in a fast-paced and dynamic environment.
Good-to-Have Skills:
Databricks Certification is a plus.
Data Modeling, Azure Architect Certification.
Senior Data Engineer
Austin, TX jobs
We are looking for a seasoned Azure Data Engineer to design, build, and optimize secure, scalable, and high-performance data solutions within the Microsoft Azure ecosystem. This will be a multi-year contract worked FULLY ONSITE in Austin, TX.
The ideal candidate brings deep technical expertise in data architecture, ETL/ELT engineering, data integration, and governance, along with hands-on experience in MDM, API Management, Lakehouse architectures, and data mesh or data hub frameworks. This position combines strategic architectural planning with practical, hands-on implementation, empowering cross-functional teams to leverage data as a key organizational asset.
Key Responsibilities
1. Data Architecture & Strategy
Design and deploy end-to-end Azure data platforms using Azure Data Lake, Azure Synapse Analytics, Azure Databricks, and Azure SQL Database.
Build and implement Lakehouse and medallion (Bronze/Silver/Gold) architectures for scalable and modular data processing.
Define and support data mesh and data hub patterns to promote domain-driven design and federated governance.
Establish standards for conceptual, logical, and physical data modeling across data warehouse and data lake environments.
2. Data Integration & Pipeline Development
Develop and maintain ETL/ELT pipelines using Azure Data Factory, Synapse Pipelines, and Databricks for both batch and streaming workloads.
Integrate diverse data sources (on-prem, cloud, SaaS, APIs) into a unified Azure data environment.
Optimize pipelines for cost-effectiveness, performance, and scalability.
3. Master Data Management (MDM) & Data Governance
Implement MDM solutions using Azure-native or third-party platforms (e.g., Profisee, Informatica, Semarchy).
Define and manage data governance, metadata, and data quality frameworks.
Partner with business teams to align data standards and maintain data integrity across domains.
4. API Management & Integration
Build and manage APIs for data access, transformation, and system integration using Azure API Management and Logic Apps.
Design secure, reliable data services for internal and external consumers.
Automate workflows and system integrations using Azure Functions, Logic Apps, and Power Automate.
5. Database & Platform Administration
Perform core DBA tasks, including performance tuning, query optimization, indexing, and backup/recovery for Azure SQL and Synapse.
Monitor and optimize cost, performance, and scalability across Azure data services.
Implement CI/CD and Infrastructure-as-Code (IaC) solutions using Azure DevOps, Terraform, or Bicep.
6. Collaboration & Leadership
Work closely with data scientists, analysts, business stakeholders, and application teams to deliver high-value data solutions.
Mentor junior engineers and define best practices for coding, data modeling, and solution design.
Contribute to enterprise-wide data strategy and roadmap development.
Required Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or related fields.
5+ years of hands-on experience in Azure-based data engineering and architecture.
Strong proficiency with the following:
Azure Data Factory, Azure Synapse, Azure Databricks, Azure Data Lake Storage Gen2
SQL, Python, PySpark, PowerShell
Azure API Management and Logic Apps
Solid understanding of data modeling approaches (3NF, dimensional modeling, Data Vault, star/snowflake schemas).
Proven experience with Lakehouse/medallion architectures and data mesh/data hub designs.
Familiarity with MDM concepts, data governance frameworks, and metadata management.
Experience with automation, data-focused CI/CD, and IaC.
Thorough understanding of Azure security, RBAC, Key Vault, and core networking principles.
What We Offer
Competitive compensation and benefits package
Luna Data Solutions, Inc. (LDS) provides equal employment opportunities to all employees. All applicants will be considered for employment. LDS prohibits discrimination and harassment of any type regarding age, race, color, religion, sexual orientation, gender identity, sex, national origin, genetics, protected veteran status, and/or disability status.
Data Engineer
Austin, TX jobs
About the Role
We are seeking a highly skilled Databricks Data Engineer with strong expertise in modern data engineering, Azure cloud technologies, and Lakehouse architectures. This role is ideal for someone who thrives in dynamic environments, enjoys solving complex data challenges, and can lead end-to-end delivery of scalable data solutions.
What We're Looking For
8+ years designing and delivering scalable data pipelines in modern data platforms
Deep experience in data engineering, data warehousing, and enterprise-grade solution delivery
Ability to lead cross-functional initiatives in matrixed teams
Advanced skills in SQL, Python, and ETL/ELT development, including performance tuning
Hands-on experience with Azure, Snowflake, and Databricks, including system integrations
Key Responsibilities
Design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform
Modernize and enhance cloud-based data ecosystems on Azure, contributing to architecture, modeling, security, and CI/CD
Use Apache Airflow and similar tools for workflow automation and orchestration
Work with financial or regulated datasets while ensuring strong compliance and governance
Drive best practices in data quality, lineage, cataloging, and metadata management
Primary Technical Skills
Develop and optimize ETL/ELT pipelines using Python, PySpark, Spark SQL, and Databricks Notebooks
Design efficient Delta Lake models for reliability and performance
Implement and manage Unity Catalog for governance, RBAC, lineage, and secure data sharing
Build reusable frameworks using Databricks Workflows, Repos, and Delta Live Tables
Create scalable ingestion pipelines for APIs, databases, files, streaming sources, and MDM systems
Automate ingestion and workflows using Python and REST APIs
Support downstream analytics for BI, data science, and application workloads
Write optimized SQL/T-SQL queries, stored procedures, and curated datasets
Automate DevOps workflows, testing pipelines, and workspace configurations
Additional Skills
Azure: Data Factory, Data Lake, Key Vault, Logic Apps, Functions
CI/CD: Azure DevOps
Orchestration: Apache Airflow (plus)
Streaming: Delta Live Tables
MDM: Profisee (nice-to-have)
Databases: SQL Server, Cosmos DB
Soft Skills
Strong analytical and problem-solving mindset
Excellent communication and cross-team collaboration
Detail-oriented with a high sense of ownership and accountability