Data Engineer
Data engineer job in Houston, TX
We are looking for a talented and motivated Python Data Engineer to help expand our data assets in support of our analytical capabilities in a full-time role. This role will have the opportunity to interface directly with our traders, analysts, researchers, and data scientists to drive out requirements and deliver a wide range of data-related needs.
What you will do:
- Translate business requirements into technical deliverables; drive out requirements for data ingestion and access
- Maintain the cleanliness of our Python codebase, while adhering to existing designs and coding conventions as much as possible
- Contribute to our developer tools and Python ETL toolkit, including standardization and consolidation of core functionality
- Efficiently coordinate with the rest of our team in different locations
Qualifications
- 6+ years of enterprise-level coding experience with Python
- Computer Science, MIS or related degree
- Familiarity with Pandas and NumPy packages
- Experience with Data Engineering and building data pipelines
- Experience scraping websites with Requests, Beautiful Soup, Selenium, etc. (see the sketch after this list)
- Strong understanding of object-oriented design, design patterns, and service-oriented (SOA) architectures
- Proficient understanding of peer-reviewing, code versioning, and bug/issue tracking tools.
- Strong communication skills
- Familiarity with containerization solutions like Docker and Kubernetes is a plus
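For context on the kind of scraping work mentioned above, here is a minimal, hypothetical sketch using Requests, Beautiful Soup, and Pandas; the URL, selector, and column names are placeholders rather than any real data source:

```
# Minimal scraping sketch: fetch a page and extract table rows into a
# DataFrame. The URL, selector, and column names are placeholders.
import requests
import pandas as pd
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/prices", timeout=30)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
rows = []
for tr in soup.select("table#prices tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

df = pd.DataFrame(rows, columns=["symbol", "price", "timestamp"])
print(df.head())
```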
Senior Data Engineer
Data engineer job in Austin, TX
We are looking for a seasoned Azure Data Engineer to design, build, and optimize secure, scalable, and high-performance data solutions within the Microsoft Azure ecosystem. This will be a multi-year contract worked FULLY ONSITE in Austin, TX.
The ideal candidate brings deep technical expertise in data architecture, ETL/ELT engineering, data integration, and governance, along with hands-on experience in MDM, API Management, Lakehouse architectures, and data mesh or data hub frameworks. This position combines strategic architectural planning with practical, hands-on implementation, empowering cross-functional teams to leverage data as a key organizational asset.
Key Responsibilities
1. Data Architecture & Strategy
Design and deploy end-to-end Azure data platforms using Azure Data Lake, Azure Synapse Analytics, Azure Databricks, and Azure SQL Database.
Build and implement Lakehouse and medallion (Bronze/Silver/Gold) architectures for scalable and modular data processing (a minimal sketch follows this section).
Define and support data mesh and data hub patterns to promote domain-driven design and federated governance.
Establish standards for conceptual, logical, and physical data modeling across data warehouse and data lake environments.
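As an illustration of the medallion pattern named above, a minimal PySpark/Delta sketch; the paths, table names, and columns are invented for the example and are not the client's actual schema:

```
# Minimal medallion (Bronze/Silver/Gold) sketch in PySpark with Delta tables.
# Paths, table names, and columns are invented for the example.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

# Bronze: land raw files as-is, stamped with ingestion metadata.
bronze = (spark.read.json("/landing/orders/")
          .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Silver: cleanse and deduplicate.
silver = (spark.table("bronze.orders")
          .filter(F.col("order_id").isNotNull())
          .dropDuplicates(["order_id"]))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

# Gold: business-level aggregate for reporting.
gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.customer_totals")
```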
2. Data Integration & Pipeline Development
Develop and maintain ETL/ELT pipelines using Azure Data Factory, Synapse Pipelines, and Databricks for both batch and streaming workloads.
Integrate diverse data sources (on-prem, cloud, SaaS, APIs) into a unified Azure data environment.
Optimize pipelines for cost-effectiveness, performance, and scalability.
3. Master Data Management (MDM) & Data Governance
Implement MDM solutions using Azure-native or third-party platforms (e.g., Profisee, Informatica, Semarchy).
Define and manage data governance, metadata, and data quality frameworks.
Partner with business teams to align data standards and maintain data integrity across domains.
4. API Management & Integration
Build and manage APIs for data access, transformation, and system integration using Azure API Management and Logic Apps.
Design secure, reliable data services for internal and external consumers.
Automate workflows and system integrations using Azure Functions, Logic Apps, and Power Automate.
5. Database & Platform Administration
Perform core DBA tasks, including performance tuning, query optimization, indexing, and backup/recovery for Azure SQL and Synapse.
Monitor and optimize cost, performance, and scalability across Azure data services.
Implement CI/CD and Infrastructure-as-Code (IaC) solutions using Azure DevOps, Terraform, or Bicep.
6. Collaboration & Leadership
Work closely with data scientists, analysts, business stakeholders, and application teams to deliver high-value data solutions.
Mentor junior engineers and define best practices for coding, data modeling, and solution design.
Contribute to enterprise-wide data strategy and roadmap development.
Required Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or related fields.
5+ years of hands-on experience in Azure-based data engineering and architecture.
Strong proficiency with the following:
Azure Data Factory, Azure Synapse, Azure Databricks, Azure Data Lake Storage Gen2
SQL, Python, PySpark, PowerShell
Azure API Management and Logic Apps
Solid understanding of data modeling approaches (3NF, dimensional modeling, Data Vault, star/snowflake schemas).
Proven experience with Lakehouse/medallion architectures and data mesh/data hub designs.
Familiarity with MDM concepts, data governance frameworks, and metadata management.
Experience with automation, data-focused CI/CD, and IaC.
Thorough understanding of Azure security, RBAC, Key Vault, and core networking principles.
What We Offer
Competitive compensation and benefits package
Luna Data Solutions, Inc. (LDS) provides equal employment opportunities to all employees. All applicants will be considered for employment. LDS prohibits discrimination and harassment of any type regarding age, race, color, religion, sexual orientation, gender identity, sex, national origin, genetics, protected veteran status, and/or disability status.
Senior Data Governance Consultant (Informatica)
Data engineer job in Plano, TX
Senior Data Governance Consultant (Informatica)
About Paradigm - Intelligence Amplified
Paradigm is a strategic consulting firm that turns vision into tangible results. For over 30 years, we've helped Fortune 500 and high-growth organizations accelerate business outcomes across data, cloud, and AI. From strategy through execution, we empower clients to make smarter decisions, move faster, and maximize return on their technology investments. What sets us apart isn't just what we do; it's how we do it. Driven by a clear mission and values rooted in integrity, excellence, and collaboration, we deliver work that creates lasting impact. At Paradigm, your ideas are heard, your growth is prioritized, and your contributions make a difference.
Summary:
We are seeking a Senior Data Governance Consultant to lead and enhance data governance capabilities across a financial services organization
The Senior Data Governance Consultant will collaborate closely with business, risk, compliance, technology, and data management teams to define data standards, strengthen data controls, and drive a culture of data accountability and stewardship
The ideal candidate will have deep experience in developing and implementing data governance frameworks, data policies, and control mechanisms that ensure compliance, consistency, and trust in enterprise data assets
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
This position is Remote, with occasional travel to Plano, TX
Responsibilities:
Data Governance Frameworks:
Design, implement, and enhance data governance frameworks aligned with regulatory expectations (e.g., BCBS 239, GDPR, CCPA, DORA) and internal control standards
Policy & Standards Development:
Develop, maintain, and operationalize data policies, standards, and procedures that govern data quality, metadata management, data lineage, and data ownership
Control Design & Implementation:
Define and embed data control frameworks across data lifecycle processes to ensure data integrity, accuracy, completeness, and timeliness
Risk & Compliance Alignment:
Work with risk and compliance teams to identify data-related risks and ensure appropriate mitigation and monitoring controls are in place
Stakeholder Engagement:
Partner with data owners, stewards, and business leaders to promote governance practices and drive adoption of governance tools and processes
Data Quality Management:
Define and monitor data quality metrics and KPIs, establishing escalation and remediation procedures for data quality issues
Metadata & Lineage:
Support metadata and data lineage initiatives to increase transparency and enable traceability across systems and processes
Reporting & Governance Committees:
Prepare materials and reporting for data governance forums, risk committees, and senior management updates
Change Management & Training:
Develop communication and training materials to embed governance culture and ensure consistent understanding across the organization
Required Qualifications:
7+ years of experience in data governance, data management, or data risk roles within financial services (banking, insurance, or asset management preferred)
Strong knowledge of data policy development, data standards, and control frameworks
Proven experience aligning data governance initiatives with regulatory and compliance requirements
Familiarity with Informatica data governance and metadata tools
Excellent communication skills with the ability to influence senior stakeholders and translate technical concepts into business language
Deep understanding of data management principles (DAMA-DMBOK, DCAM, or equivalent frameworks)
Bachelor's or Master's Degree in Information Management, Data Science, Computer Science, Business, or related field
Preferred Qualifications:
Hands-on experience with Informatica, including Master Data Management (MDM) or Informatica Data Management Cloud (IDMC), is preferred
Experience with data risk management or data control testing
Knowledge of financial regulatory frameworks (e.g., Basel, MiFID II, Solvency II, BCBS 239)
Certifications, such as Informatica, CDMP, or DCAM
Background in consulting or large-scale data transformation programs
Key Competencies:
Strategic and analytical thinking
Strong governance and control mindset
Excellent stakeholder and relationship management
Ability to drive organizational change and embed governance culture
Attention to detail with a pragmatic approach
Why Join Paradigm
At Paradigm, integrity drives innovation. You'll collaborate with curious, dedicated teammates, solving complex problems and unlocking immense data value for leading organizations. If you seek a place where your voice is heard, growth is supported, and your work creates lasting business value, you belong at Paradigm.
Learn more at ********************
Policy Disclosure:
Paradigm maintains a strict drug-free workplace policy. All offers of employment are contingent upon successfully passing a standard 5-panel drug screen. Please note that a positive test result for any prohibited substance, including marijuana, will result in disqualification from employment, regardless of state laws permitting its use. This policy applies consistently across all positions and locations.
Data Scientist
Data engineer job in Plano, TX
Job Title: Data Scientist / Gen AI Lead Consultant
Location: Bridgewater, NJ; Sunnyvale, CA; Austin, TX; Raleigh, NC; Richardson, TX; Tempe, AZ; Phoenix, AZ; Charlotte, NC; Houston, TX; Denver, CO; Hartford, CT; New York, NY; Palm Beach, FL; Tampa, FL; Alpharetta, GA
Job type: full-time
Job Description:
We are seeking a Data Scientist / Generative AI Lead Consultant with strong expertise in Generative AI, Agentic AI, Machine Learning, and Python. In this role, you will drive end-to-end implementation of AI solutions, from problem identification to model deployment, leveraging the latest advancements in Large Language Models (LLMs), RAG, agent frameworks, and cloud platforms.
You will work closely with client stakeholders, architects, and offshore teams to build scalable, production-grade AI systems aligned with enterprise data strategies.
Responsibilities
Lead end-to-end development of Generative AI and Agentic AI solutions, including business problem discovery, solution design, model development, optimization, and deployment.
Fine-tune, evaluate, and deploy Large Language Models and build Advanced RAG pipelines (see the sketch after this list).
Architect and implement AI workflows using LangGraph, AutoGen, Crew AI, or similar agent frameworks.
Build scalable AI applications using Python, modern ML frameworks, and cloud-based GenAI services.
Deploy solutions using platforms such as AWS Bedrock, Azure OpenAI, Google Vertex AI, or IBM Watson.
Ingest and process unstructured data including PDFs, HTML, images, and audio-to-text pipelines.
Work with vector databases such as FAISS, Pinecone, Weaviate, or Azure AI Search.
Ensure data quality, data governance, and adherence to coding best practices across the AI lifecycle.
Collaborate with agile teams, drive sprint execution, provide mentorship, and coordinate with offshore delivery teams.
Build and publish reusable assets, best practices, and accelerators for AI implementations.
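To make the RAG idea concrete, a minimal retrieval sketch using FAISS; random vectors stand in for real embeddings, and an embedding model plus an LLM for generation would replace the placeholders in practice:

```
# Minimal RAG retrieval sketch with FAISS. Random vectors stand in for real
# embeddings; an embedding model and an LLM for generation would replace
# the placeholders in practice.
import numpy as np
import faiss

dim = 384
docs = ["refund policy ...", "shipping times ...", "warranty terms ..."]
doc_vecs = np.random.rand(len(docs), dim).astype("float32")  # placeholder embeddings

index = faiss.IndexFlatL2(dim)  # exact L2 nearest-neighbor index
index.add(doc_vecs)

query_vec = np.random.rand(1, dim).astype("float32")  # placeholder query embedding
distances, ids = index.search(query_vec, 2)  # retrieve the 2 nearest documents

context = "\n".join(docs[i] for i in ids[0])
prompt = f"Answer using only this context:\n{context}\n\nQuestion: ..."
print(prompt)
```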
Required Qualifications
Bachelor's Degree or foreign equivalent (or 3 years of relevant progressive experience per year of missing education).
7+ years of experience in Information Technology.
4+ years of hands-on experience in Generative AI / Agentic AI / Machine Learning / Data Science.
Strong proficiency in Python programming.
Experience deploying AI applications using agent frameworks such as LangGraph, AutoGen, or Crew AI.
Experience with cloud-native Gen AI services on AWS, Azure, GCP, or IBM Watson.
Hands-on experience with RAG, multiple LLMs, and GenAI pipelines.
Experience processing unstructured data (PDF, image, HTML, OCR, audio-to-text).
Strong understanding of data gathering, data quality, system architecture, and ML coding best practices.
Experience with vector databases (FAISS, Pinecone, Weaviate, Azure AI Search).
Experience with Agile/Lean development methodologies.
Preferred Qualifications
Experience with multiple programming languages: Python, R, Scala, Java, SQL.
Hands-on experience with CI/CD pipelines & DevOps tools (Jenkins, GitHub Actions, Terraform).
Proficiency with both SQL and NoSQL databases (PostgreSQL, MongoDB, CosmosDB, DynamoDB).
Deep Learning experience: CNNs, RNNs, LSTMs, and exposure to emerging research.
Experience with AI/ML frameworks such as TensorFlow, PyTorch, LangChain.
Strong background in LLM fine-tuning, optimization, quantization, and local deployment.
Experience building RESTful APIs using FastAPI, Flask, or Django.
Knowledge of model evaluation tools such as DeepEval, FMEval, RAGAS, Bedrock evaluations.
Experience with computer vision, time-series, and NLP pipelines.
Strong Big Data skills: HDFS, Hive, Spark, Scala.
Exposure to data visualization tools (Tableau) and data query tools (SQL, Hive).
Strong applied statistics background (distributions, statistical testing, regression, etc.).
Data Engineer
Data engineer job in Houston, TX
About the Company:
This is a fantastic opportunity to join an internationally renowned powerhouse within the supply chain industry. You will have the opportunity to work on a variety of different projects across teams.
Role:
In this position, you'll be the architect of our data pipelines, instrumental in operationalizing data & advanced analytics to drive organizational outcomes. Your responsibilities will include:
Strategic: Provide architectural input to project teams, aligning solutions with long-term data warehouse strategy.
Data Management: Navigate various data architectures like Data Warehouse, Data Lake, Data Hub, and Data Vault, ensuring seamless integration and governance.
Integration: Harness the power of ETL/ELT, data replication, message-oriented data movement, and API-based data acquisition to work with large, heterogeneous datasets.
Data: Analyze large datasets, uncovering insights and anomalies that fuel informed decision-making.
Requirements:
Experience with Snowflake.
Proficiency in complex SQL code and DevOps capabilities.
Experience with ETL tools and open-source/commercial message queuing technologies.
Strong Python skills for data manipulation and automation (see the sketch after this list).
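As a rough illustration of the Python-plus-Snowflake work described above, a sketch using the snowflake-connector-python library; the account, credentials, and table names are placeholders:

```
# Minimal sketch: run a transformation in Snowflake from Python via the
# snowflake-connector-python library. Credentials and table names are
# placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical account identifier
    user="etl_user",
    password="********",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    cur.execute("""
        CREATE OR REPLACE TABLE daily_sales AS
        SELECT order_date, SUM(amount) AS total_amount
        FROM raw_orders
        GROUP BY order_date
    """)
finally:
    cur.close()
    conn.close()
```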
Preferred Experience:
Familiarity with SQL Server, SAP Hana, Tableau, Power BI, dbt, and Azure Data Factory.
Experience in building forecasting models including linear regression and time-series analysis.
This role is based in Houston on a hybrid model. Compensation offered is in the region of $100,000-$150,000, with additional bonus compensation (approximately 15%) dependent on company performance.
If interested, please apply.
Equal Opportunity Employer/Disability/Veterans.
Data Scientist with DataBricks experience
Data engineer job in Plano, TX
About the Company:
Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction.
Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.
Data Scientist with DataBricks experience
Plano, TX (5 Days Onsite)
Interview Mode: Phone & F2F
Rate Range: $40 to $47/hr W2 (all-inclusive)
Job Overview
We are seeking a highly skilled, independent, and results-driven Data Scientist with strong expertise in GenAI, LLMs, Python, Spark, Databricks, and End-to-End ML Model Development. The ideal candidate will be capable of translating business needs into scalable analytical solutions and delivering production-grade machine learning models.
Key Responsibilities
Build statistically sound analyses and production-ready ML models.
Develop and optimize models using H2O frameworks (XGBoost, Logistic Regression, Neural Networks, Random Forest); a minimal sketch follows this list.
Work extensively with MongoDB and NoSQL datasets.
Design, develop, and maintain ML training pipelines and support ML model deployment.
Utilize Databricks, Hadoop ecosystem, and PySpark for large-scale data preparation and modeling.
Implement advanced ML concepts such as:
Real-time distributed model inferencing pipelines
A/B Testing
Champion/Challenger frameworks
Develop scalable workflows using Python, Spark, and related libraries.
Use Unix/Linux and shell scripting for automation and environment management.
Support ML production implementation, monitoring, and troubleshooting.
Deliver high-quality solutions with strong attention to detail and timely execution.
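For a sense of the H2O workflow referenced above, a minimal sketch training a logistic regression (binomial GLM) on a toy frame; the columns and data are illustrative only:

```
# Minimal H2O sketch: train a logistic regression (binomial GLM) on a toy
# frame. Columns and data are illustrative only.
import h2o
from h2o.estimators import H2OGeneralizedLinearEstimator

h2o.init()

frame = h2o.H2OFrame({
    "x1": [0.1, 0.5, 0.9, 0.3, 0.7, 0.2],
    "x2": [1.0, 0.2, 0.4, 0.8, 0.1, 0.9],
    "y":  [0, 1, 1, 0, 1, 0],
})
frame["y"] = frame["y"].asfactor()  # treat the target as a classification label

model = H2OGeneralizedLinearEstimator(family="binomial")
model.train(x=["x1", "x2"], y="y", training_frame=frame)
print(model.auc())  # training AUC
```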
Required Skills & Qualifications
Python & Spark Expertise
Databricks & PySpark Experience
GenAI & LLM Knowledge
H2O Model Development (XGBoost, Logistic Regression, Neural Networks, Random Forest)
MongoDB / NoSQL Databases
Hadoop Ecosystem
ML Pipeline Development & Deployment
Real-Time Model Inferencing
Champion/Challenger Frameworks & A/B Testing
Unix/Linux & Shell Scripting
DS/ML Production Implementation
Preferred Skills
Azure Cloud experience
Advanced experience with Databricks in cloud environments
Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.
Data Engineer
Data engineer job in Austin, TX
About the Role
We are seeking a highly skilled Databricks Data Engineer with strong expertise in modern data engineering, Azure cloud technologies, and Lakehouse architectures. This role is ideal for someone who thrives in dynamic environments, enjoys solving complex data challenges, and can lead end-to-end delivery of scalable data solutions.
What We're Looking For
8+ years designing and delivering scalable data pipelines in modern data platforms
Deep experience in data engineering, data warehousing, and enterprise-grade solution delivery
Ability to lead cross-functional initiatives in matrixed teams
Advanced skills in SQL, Python, and ETL/ELT development, including performance tuning
Hands-on experience with Azure, Snowflake, and Databricks, including system integrations
Key Responsibilities
Design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform
Modernize and enhance cloud-based data ecosystems on Azure, contributing to architecture, modeling, security, and CI/CD
Use Apache Airflow and similar tools for workflow automation and orchestration
Work with financial or regulated datasets while ensuring strong compliance and governance
Drive best practices in data quality, lineage, cataloging, and metadata management
Primary Technical Skills
Develop and optimize ETL/ELT pipelines using Python, PySpark, Spark SQL, and Databricks Notebooks
Design efficient Delta Lake models for reliability and performance
Implement and manage Unity Catalog for governance, RBAC, lineage, and secure data sharing
Build reusable frameworks using Databricks Workflows, Repos, and Delta Live Tables (see the sketch after this list)
Create scalable ingestion pipelines for APIs, databases, files, streaming sources, and MDM systems
Automate ingestion and workflows using Python and REST APIs
Support downstream analytics for BI, data science, and application workloads
Write optimized SQL/T-SQL queries, stored procedures, and curated datasets
Automate DevOps workflows, testing pipelines, and workspace configurations
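As one hedged illustration of the Delta Live Tables item above, a minimal DLT sketch; it runs only inside a Databricks DLT pipeline, where the `dlt` module and `spark` session are provided by the runtime, and the source path and expectation rule are invented:

```
# Minimal Delta Live Tables sketch; runs only inside a Databricks DLT
# pipeline, where the `dlt` module and `spark` session are provided by the
# runtime. Source path and expectation rule are illustrative.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events landed from cloud storage")
def bronze_events():
    return spark.read.format("json").load("/landing/events/")

@dlt.table(comment="Validated events for downstream analytics")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
def silver_events():
    return dlt.read("bronze_events").withColumn("processed_at", F.current_timestamp())
```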
Additional Skills
Azure: Data Factory, Data Lake, Key Vault, Logic Apps, Functions
CI/CD: Azure DevOps
Orchestration: Apache Airflow (plus)
Streaming: Delta Live Tables
MDM: Profisee (nice-to-have)
Databases: SQL Server, Cosmos DB
Soft Skills
Strong analytical and problem-solving mindset
Excellent communication and cross-team collaboration
Detail-oriented with a high sense of ownership and accountability
Data Scientist
Data engineer job in Dallas, TX
Data Scientist (F2F interview)
W2 Contract
Dallas, TX (Onsite)
We are seeking an experienced Data Scientist to join our team in Dallas, Texas. The ideal candidate will have a strong foundation in machine learning, data modeling, and statistical analysis, with the ability to transform complex datasets into clear, actionable insights that drive business impact.
Key Responsibilities
Develop, implement, and optimize machine learning models to support business objectives (a minimal sketch follows this list).
Perform exploratory data analysis, feature engineering, and predictive modeling.
Translate analytical findings into meaningful recommendations for technical and non-technical stakeholders.
Collaborate with cross-functional teams to identify data-driven opportunities and improve decision-making.
Build scalable data pipelines and maintain robust analytical workflows.
Communicate insights through reports, dashboards, and data visualizations.
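As a small illustration of the modeling work described above, a scikit-learn sketch on synthetic data standing in for a real business dataset:

```
# Minimal scikit-learn sketch: train and evaluate a classifier on synthetic
# data standing in for a real business dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```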
Qualifications
Bachelor's or Master's degree in Data Science, Statistics, Computer Science, or a related field.
Proven experience working with machine learning algorithms and statistical modeling techniques.
Proficiency in Python or R, along with hands-on experience using libraries such as Pandas, NumPy, Scikit-learn, or TensorFlow.
Strong SQL skills and familiarity with relational or NoSQL databases.
Experience with data visualization tools (e.g., Tableau, Power BI, matplotlib).
Excellent problem-solving, communication, and collaboration skills.
Azure Data Engineer
Data engineer job in Irving, TX
Our client is seeking an Azure Data Engineer to join their team! This position is located in Irving, Texas. THIS ROLE REQUIRES AN ONSITE INTERVIEW IN IRVING; please only apply if you are local and available to interview onsite.
Duties:
Lead the design, architecture, and implementation of key data initiatives and platform capabilities
Optimize existing data workflows and systems to improve performance and cost-efficiency, identifying solutions and guiding teams to implement them
Lead and mentor a team of 2-5 data engineers, providing guidance on technical best practices, career development, and initiative execution
Contribute to the development of data engineering standards, processes, and documentation, promoting consistency and maintainability across teams while enabling business stakeholders
Desired Skills/Experience:
Bachelor's degree or equivalent in Computer Science, Mathematics, Software Engineering, Management Information Systems, etc.
5+ years of relevant work experience in data engineering
Strong technical skills in SQL, PySpark/Python, Azure, and Databricks
Deep understanding of data engineering fundamentals, including database architecture and design, ETL, etc.
Benefits:
Medical, Dental, & Vision Insurance Plans
Employee-Owned Profit Sharing (ESOP)
401K offered
The approximate pay for this position starts at $140,000-$145,000+. Please note that the pay range provided is a good faith estimate. Final compensation may vary based on factors including but not limited to background, knowledge, skills, and location. We comply with local wage minimums.
Senior Upstream Data Engineer (Architect)
Data engineer job in Spring, TX
The Upstream Data Engineer will design, develop, and optimize enterprise data solutions that support drilling, reservoir engineering, completions, production optimization, and broader subsurface workflows. This role combines advanced data engineering expertise with deep functional knowledge of upstream oil and gas to enable high-quality analytics and accelerate operational decision making.
Key Responsibilities
Architect, build, and maintain scalable data pipelines for drilling, reservoir, and production datasets leveraging Python and modern ELT/ETL frameworks
Ingest, harmonize, and curate industry data sources such as WITSML, ProdML, LAS, SCADA historian data, seismic, well logs, and WellView/OpenWells datasets (see the ingestion sketch after this list)
Design and implement robust data models in Snowflake and Databricks to support operational reporting, subsurface analytics, AI/ML, and reservoir engineering workflows
Utilize open table formats such as Apache Iceberg to support efficient data lineage, versioning, governance, and incremental processing
Collaborate with drilling, geoscience, and reservoir engineering stakeholders to translate business requirements into reusable technology solutions
Apply orchestration, CI/CD, and DevOps practices to ensure reliability and automation across cloud environments
Improve data product performance, availability, quality, and compliance aligned with upstream data governance standards and PPDM/O&G reference models
Troubleshoot and support production data pipelines and ensure secure, optimized access to datasets
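As a hedged illustration of the LAS ingestion mentioned above, a minimal sketch using the open-source lasio library; the file path and the "GR" (gamma ray) curve mnemonic are assumptions for the example:

```
# Minimal well-log ingestion sketch using the open-source lasio library.
# The file path and the "GR" (gamma ray) curve mnemonic are assumptions.
import lasio

las = lasio.read("example_well.las")   # parse a LAS-format well-log file
df = las.df().reset_index()            # curves as a DataFrame, depth as a column

df["well_name"] = las.well["WELL"].value   # tag rows with the well header
if "GR" in df.columns:
    df = df.dropna(subset=["GR"])          # drop null gamma-ray readings
print(df.head())
```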
Required Qualifications
Bachelor's degree in Petroleum Engineering, Computer Science, Data Engineering, or related technical discipline
Proven experience working directly within upstream oil and gas domains such as drilling operations, reservoir management, completions, or production engineering
Strong Python programming skills and experience building reusable transformation frameworks
Hands-on experience with Snowflake and Databricks including Delta Lake or similar distributed processing capabilities
Experience with open data lakehouse architectures and formats (Apache Iceberg preferred)
Proficiency in SQL, cloud services (Azure or AWS), distributed compute concepts, and data ingestion frameworks
Solid understanding of the well lifecycle, subsurface engineering concepts, and upstream operational KPIs
Preferred Skills
Experience with Cognite Data Fusion for contextualization and integration of operational, engineering, and IT data to enable analytics and AI solutions
Familiarity with OSDU data platform or PPDM standards for upstream data governance
Experience building analytics-ready datasets for data science and real-time operational decision support
Knowledge of BI reporting tools such as Power BI or Spotfire used in E&P environments
Exposure to real-time data ingestion from drilling rigs, control systems, or production operations
Data Engineer (Python, PySpark, Databricks)
Data engineer job in Dallas, TX
Job Title: Data Engineer (Python, PySpark, Databricks)
We are seeking a Data Engineer with strong proficiency in SQL, Python, and PySpark to support high-performance data pipelines and analytics initiatives. This role will focus on scalable data processing, transformation, and integration efforts that enable business insights, regulatory compliance, and operational efficiency.
Data Engineer - SQL, Python, and PySpark Expert (Onsite - Dallas, TX)
Key Responsibilities
Design, develop, and optimize ETL/ELT pipelines using SQL, Python, and PySpark for large-scale data environments (a minimal sketch follows this list)
Implement scalable data processing workflows in distributed data platforms (e.g., Hadoop, Databricks, or Spark environments)
Partner with business stakeholders to understand and model mortgage lifecycle data (origination, underwriting, servicing, foreclosure, etc.)
Create and maintain data marts, views, and reusable data components to support downstream reporting and analytics
Ensure data quality, consistency, security, and lineage across all stages of data processing
Assist in data migration and modernization efforts to cloud-based data warehouses (e.g., Snowflake, Azure Synapse, GCP BigQuery)
Document data flows, logic, and transformation rules
Troubleshoot performance and quality issues in batch and real-time pipelines
Support compliance-related reporting (e.g., HMDA, CFPB)
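For a flavor of the PySpark work involved, a minimal sketch that standardizes loan records and builds a servicing-status summary; the schema, paths, and status handling are illustrative, not a real mortgage data model:

```
# Minimal PySpark sketch: standardize loan records and build a servicing-
# status summary. The schema, paths, and status handling are illustrative,
# not a real mortgage data model.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("loan-pipeline").getOrCreate()

loans = spark.read.parquet("/data/raw/loans/")

clean = (loans
         .withColumn("loan_amount", F.col("loan_amount").cast("decimal(18,2)"))
         .withColumn("status", F.upper(F.trim(F.col("status"))))
         .dropDuplicates(["loan_id"]))

summary = (clean.groupBy("status")
           .agg(F.count("*").alias("loan_count"),
                F.sum("loan_amount").alias("total_upb")))
summary.write.mode("overwrite").parquet("/data/marts/loan_status_summary/")
```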
Required Qualifications
6+ years of experience in data engineering or data development
Advanced expertise in SQL (joins, CTEs, optimization, partitioning, etc.)
Strong hands-on skills in Python for scripting, data wrangling, and automation
Proficient in PySpark for building distributed data pipelines and processing large volumes of structured/unstructured data
Experience working with mortgage banking data sets and domain knowledge is highly preferred
Strong understanding of data modeling (dimensional, normalized, star schema)
Experience with cloud-based platforms (e.g., Azure Databricks, AWS EMR, GCP Dataproc)
Familiarity with ETL tools, orchestration frameworks (e.g., Airflow, ADF, dbt)
Data Analytics Engineer
Data engineer job in Houston, TX
Title: Data Analytics Engineer
Type: 6 Month Contract (Full-time is possible after contract period)
Schedule: Hybrid (3-4 days onsite)
Sector: Oil & Gas
Overview: You will be instrumental in developing and maintaining data models while delivering insightful analyses of maintenance operations, including uptime/downtime, work order metrics, and asset health.
Key Responsibilities:
Aggregate and transform raw data from systems such as CMMS, ERP, and SCADA into refined datasets and actionable reports/visualizations using tools like SQL, Python, Power BI, and/or Spotfire (see the sketch after this list).
Own the creation and maintenance of dashboards for preventative and predictive maintenance.
Collaborate cross-functionally to identify data requirements, key performance indicators (KPIs), and reporting gaps.
Ensure high data quality through rigorous testing, validation, and documentation.
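As a concrete example of the maintenance analytics described above, a minimal pandas sketch computing uptime and mean time between failures per asset; the column names are assumptions about a CMMS export:

```
# Minimal sketch: uptime % and mean time between failures (MTBF) per asset
# from work-order events. Column names are assumptions about a CMMS export.
import pandas as pd

events = pd.read_csv("work_orders.csv", parse_dates=["start_time", "end_time"])
events["downtime_hours"] = (
    (events["end_time"] - events["start_time"]).dt.total_seconds() / 3600
)

period_hours = 24 * 30  # reporting window: one 30-day month
by_asset = events.groupby("asset_id").agg(
    downtime_hours=("downtime_hours", "sum"),
    failures=("work_order_id", "count"),
)
by_asset["uptime_pct"] = 100 * (1 - by_asset["downtime_hours"] / period_hours)
by_asset["mtbf_hours"] = (period_hours - by_asset["downtime_hours"]) / by_asset["failures"]
print(by_asset)
```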
Qualifications and Skills:
Bachelor's degree required.
Proficiency in Python and SQL is essential.
Knowledge of API rules and protocols.
Experience organizing development workflows using GitHub.
Familiarity with Machine Learning is a plus.
Preference for candidates with experience in water midstream/infrastructure or Oil & Gas sectors.
Expertise in dashboard creation using tools like Tableau, Spotfire, Excel, or Power BI.
Ability to clearly communicate technical concepts to non-technical stakeholders.
Strong organizational skills and a customer-service mindset.
Capability to work independently or collaboratively with minimal supervision.
Exceptional analytical and problem-solving skills, with a strategic approach to prioritization.
Ability to analyze data, situations, and processes to make informed decisions or resolve issues, with regular communication to management.
Excellent written and verbal communication skills.
Staff Data Engineer
Data engineer job in Houston, TX
About the Company
We are an AI-driven technology company building the data infrastructure that powers how connected physical systems are monitored, maintained, and optimised. Our platform transforms high-volume sensor data into real-time intelligence, predictive insights, and automated decision-making.
We're creating a modern, AI-native data ecosystem from the ground up.
The Role
As our first Staff Data Engineer, you will design and build the core data platform that underpins analytics, real-time intelligence, and machine learning across the organisation. Working closely with senior leadership, you'll define architecture, create scalable pipelines, and establish best practices for a growing data ecosystem handling high-volume IoT data.
This is a hands-on Staff role for an engineer who thrives at the intersection of software systems, data engineering, and AI enablement.
What You'll Do
Architect and build scalable data platforms and pipelines (batch + streaming)
Design data models and abstractions to support analytics and ML workloads
Work with time-series and relational databases to manage high-volume telemetry (see the sketch after this list)
Enable real-time data consumption across product and AI systems
Implement best practices around data quality, lineage, observability, and governance
Partner with engineering and ML teams on platform strategy and architecture
Mentor and influence data engineering standards across teams
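To illustrate the telemetry work above, a minimal pandas sketch that downsamples raw sensor readings to one-minute aggregates, a common pre-aggregation step for dashboards and ML features; the CSV layout is assumed for the example:

```
# Minimal time-series sketch: downsample raw sensor telemetry to one-minute
# aggregates. The CSV layout (ts, sensor_id, value) is assumed.
import pandas as pd

raw = pd.read_csv("telemetry.csv", parse_dates=["ts"]).set_index("ts")

per_minute = (
    raw.groupby("sensor_id")
       .resample("1min")["value"]
       .agg(["mean", "min", "max", "count"])
       .reset_index()
)
print(per_minute.head())
```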
What We're Looking For
8+ years in data or software engineering with ownership of large-scale systems
Strong experience with modern data pipelines and ETL frameworks (Python, Spark, Airflow, dbt)
Expertise across data lakes, warehouses, and streaming systems
Advanced SQL and strong data modelling skills
Experience enabling ML workflows and AI-ready data platforms
Familiarity with cloud-native data environments (AWS preferred)
Systems thinker with strong engineering fundamentals and communication skills
Infrequent travel to Houston for all-team meets
Why Join
Build a data platform from first principles
Define architecture at the Staff level
Work with large-scale real-world data
High ownership and visible impact
Senior Data Engineer
Data engineer job in Houston, TX
Our Midstream Oil and Gas client is seeking a skilled Data Engineer to design, build, and maintain scalable data infrastructure that supports advanced analytics and business intelligence initiatives. You will work closely with data scientists, analysts, and software engineers to ensure data availability, quality, and reliability across the organization.
Key Responsibilities
Design, develop, and maintain data pipelines and ETL/ELT processes for collecting, transforming, and storing data from multiple sources
Build and optimize data models, data warehouses, and data lakes for analytics and reporting.
Collaborate with stakeholders to define data architecture and ensure data integrity, consistency, and security
Implement and maintain data quality frameworks, monitoring, and alerting systems.
Work with cloud platforms (AWS, Azure, or GCP) to deploy scalable data infrastructure
Support the integration of real-time streaming data using tools like Kafka, Kinesis, or Spark Streaming
Automate workflows using orchestration tools such as Airflow, Prefect, or Dagster (see the DAG sketch after this list)
Ensure compliance with data governance, security, and privacy standards (e.g., GDPR, HIPAA).
Troubleshoot data-related issues and optimize performance of queries and pipelines
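As a minimal illustration of the orchestration item above, an Apache Airflow 2.x DAG sketch; the DAG id, schedule, and task bodies are placeholders:

```
# Minimal Airflow 2.x sketch: a daily extract -> transform DAG. Task bodies
# are stubs; the DAG id and schedule are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull from source systems")

def transform():
    print("clean and load into the warehouse")

with DAG(
    dag_id="midstream_daily_load",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # `schedule_interval` on Airflow versions before 2.4
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```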
Required Qualifications
7+ years of experience as a Data Engineer or similar role
Strong proficiency in SQL and a programming language such as Python
Hands-on experience with ETL frameworks, data warehousing, and big data tools (e.g., Spark, Hadoop).
Experience with cloud-based data solutions such as AWS Redshift, Snowflake, BigQuery, or Azure Synapse.
Proficiency with data orchestration and workflow automation tools
Solid understanding of data modeling, data architecture, and API integration
Data Engineer
Data engineer job in Austin, TX
Hello,
Role: Data Engineering & Analytics Consultant
I am looking for ex-Apple employee candidates.
Note: Data engineering with advanced proficiency in Python and SQL; 4-5 years of experience.
Job Overview :
We are seeking a Software Engineer with strong SQL and Python skills to develop reliable data pipelines, optimize complex workflows, and deliver scalable data products that empower decision-making across Apple's ecosystem.
Key Responsibilities:
Design, build, and optimize ETL/ELT data pipelines using Python, SQL, and modern orchestration tools.
Develop and maintain data models, APIs, and microservices that enable analytical and operational use cases.
Work closely with cross-functional partners (Data Science, Product, Finance, and Operations) to translate business needs into engineering solutions.
Apply software engineering best practices (version control, CI/CD, testing, observability) to data workflows (see the testing sketch after this list).
Optimize data quality, scalability, and latency across distributed systems (Snowflake, Spark, Databricks, etc.).
Participate in architecture discussions on data warehousing, event streaming, and ML data pipelines.
Ensure compliance with Apple's privacy, security, and governance standards in all data operations.
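As one small example of applying engineering practices to data workflows, a pytest-style test of a hypothetical transformation:

```
# Minimal sketch of testing a data transformation with pytest. The function
# under test is hypothetical; run with `pytest` to execute the test.
import pandas as pd

def normalize_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate order ids and coerce amounts to float."""
    out = df.drop_duplicates(subset=["order_id"]).copy()
    out["amount"] = out["amount"].astype(float)
    return out

def test_normalize_orders_deduplicates_and_casts():
    raw = pd.DataFrame({"order_id": [1, 1, 2], "amount": ["5", "5", "7"]})
    result = normalize_orders(raw)
    assert list(result["order_id"]) == [1, 2]
    assert result["amount"].dtype == float
```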
Minimum Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
3-7 years of experience in software or data engineering.
Advanced proficiency in Python (Pandas, PySpark, or similar frameworks).
Strong SQL expertise: ability to write and optimize complex queries and stored procedures.
Proven experience with data modeling, schema design, and performance tuning.
Experience building or orchestrating workflows using Airflow, Dagster, or similar tools.
Solid understanding of APIs, CI/CD pipelines, Git, and containerization (Docker/Kubernetes).
Data Engineer
Data engineer job in Temple, TX
SeAH Superalloy Technologies is building a world-class manufacturing facility in Temple, Texas, producing aerospace-grade nickel-based superalloys for investment casting and additive manufacturing. As part of SeAH Group's $150M U.S. greenfield investment, we're shaping the future of advanced manufacturing and establishing strong partnerships with industry leaders, suppliers, and communities.
Position Summary
We are seeking a highly skilled and proactive Data Engineer to lead and support the development and optimization of our analytics infrastructure. This role will focus on building scalable, secure, and maintainable data pipelines across enterprise systems like ERP, MES, SCADA, and WMS. The ideal candidate has a strong technical foundation in data engineering, exceptional problem-solving skills, and experience in both on-prem and cloud environments. This role will also involve the development of dashboards, visualization tools, and predictive analytics for use across operations, engineering, and executive leadership.
Key Responsibilities
Data Engineering & Pipeline Development:
Design, build, and maintain robust, fault-tolerant data pipelines and ingestion workflows.
Lead integration of key enterprise systems (ERP, MES, CMMS, SCADA, WMS).
Optimize pipelines for performance, scalability, and long-term maintainability.
Clean, transform, and augment raw industrial data to ensure accuracy and analytical value.
System Integration & API Management:
Develop and maintain RESTful API connectivity for cross-platform communication (see the ingestion sketch after this section).
Work with structured and semi-structured data formats (SQL, CSV, PLC logs, etc.).
Translate complex business requirements into scalable data architecture.
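As a hedged sketch of the REST ingestion described above; the endpoint, auth token, and response shape are placeholders, not a real MES API:

```
# Minimal sketch: pull paginated records from a REST API and stage them for
# the pipeline. The endpoint, auth token, and response shape are placeholders.
import requests
import pandas as pd

BASE_URL = "https://mes.example.com/api/v1/production-runs"  # hypothetical endpoint
headers = {"Authorization": "Bearer <token>"}

records, page = [], 1
while True:
    resp = requests.get(BASE_URL, params={"page": page}, headers=headers, timeout=30)
    resp.raise_for_status()
    batch = resp.json().get("results", [])
    if not batch:
        break
    records.extend(batch)
    page += 1

pd.DataFrame(records).to_parquet("staging/production_runs.parquet")
```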
Visualization & Reporting:
Create and maintain dashboards and reports using Power BI or similar tools.
Automate report generation for predictive analytics, anomaly detection, and performance insights.
Collaborate with stakeholders to customize visual outputs and provide decision-ready insights.
Data Collection, Governance & Security:
Implement ETL processes and ensure proper data governance protocols.
Conduct quality checks, monitor ingestion workflows, and enforce secure data handling practices.
Perform backups and manage version control for code and reports.
Collaboration & Agile Operations:
Participate in agile team meetings, code reviews, and sprint planning.
Support internal teams with technical troubleshooting and training.
Gather requirements directly from stakeholders to refine data strategies.
Qualifications
Bachelor's degree in Computer Science, Engineering, Data Science, or a related field.
5+ years of professional experience in data engineering, analytics, or a related technical role.
Strong experience with REST APIs, microservices, and data pipeline orchestration.
Proficient in SQL and scripting languages (Python, Bash, PowerShell).
Experience with data warehousing, ETL design, and industrial datasets.
Familiarity with on-prem and cloud environments.
Excellent analytical, communication, and problem-solving skills.
Preferred/Bonus Skills
Experience integrating data from PLCs or industrial protocols.
Familiarity with Power BI, MES, or CMMS tools.
Experience applying cybersecurity standards to data infrastructure.
Knowledge of manufacturing environments, especially in metals or high-spec industries.
GCP Data Engineer
Data engineer job in Fort Worth, TX
Job Title: GCP Data Engineer
Employment Type: W2/CTH
Client: Direct
We are seeking a highly skilled Data Engineer with strong expertise in Python, SQL, and Google Cloud Platform (GCP) services. The ideal candidate will have 6-8 years of hands-on experience in building and maintaining scalable data pipelines, working with APIs, and leveraging GCP tools such as BigQuery, Cloud Composer, and Dataflow.
Core Responsibilities:
• Design, build, and maintain scalable data pipelines to support analytics and business operations.
• Develop and optimize ETL processes for structured and unstructured data.
• Work with BigQuery, Cloud Composer, and other GCP services to manage data workflows (see the sketch after this list).
• Collaborate with data analysts and business teams to ensure data availability and quality.
• Integrate data from multiple sources using APIs and custom scripts.
• Monitor and troubleshoot pipeline performance and reliability.
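For a feel of the BigQuery work above, a minimal sketch using the google-cloud-bigquery client; the project, dataset, and SQL are illustrative:

```
# Minimal BigQuery sketch: run a query and pull the result into a DataFrame.
# The project, dataset, and SQL are illustrative.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

sql = """
    SELECT customer_id, SUM(amount) AS total_amount
    FROM `my_project.sales.orders`
    GROUP BY customer_id
"""
df = client.query(sql).to_dataframe()  # blocks until the query job finishes
print(df.head())
```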
Technical Skills:
• Strong proficiency in Python and SQL.
• Experience with data pipeline development and ETL frameworks.
GCP Expertise:
• Hands-on experience with BigQuery, Cloud Composer, and Dataflow.
Additional Requirements:
• Familiarity with workflow orchestration tools and cloud-based data architecture.
• Strong problem-solving and analytical skills.
• Excellent communication and collaboration abilities.
Data Scientist with data analyst skills
Data engineer job in Houston, TX
One of our staffing partners is helping a financial client hire a Data Scientist with data analyst skills
Salary offered: $60-75k per year
Direct applicants only; no C2C or company candidates due to low margins. The staffing company or client will reach out directly to applicants.
Responsibilities:
Design, train, and fine-tune Large Language Models (LLMs) for various applications
Experience with SQL, Excel, and Power BI
Collaborate with cross-functional teams to integrate AI into real-world applications
Analyze and preprocess massive datasets for AI-driven insights
Qualifications:
Proficiency in Python, TensorFlow, PyTorch, and Hugging Face libraries (see the sketch after this list)
Knowledge of transformer architectures, attention mechanisms, and reinforcement learning
Experience with fine-tuning and optimizing foundation models
Familiarity with cloud-based AI environments (AWS, Azure, or GCP)
Experience with AI model deployment and optimization techniques
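As a tiny illustration of the Hugging Face tooling named above, an inference sketch with a public checkpoint; fine-tuning would build on the same objects via the transformers Trainer API:

```
# Minimal Hugging Face sketch: load a public checkpoint and run inference.
# Fine-tuning would build on the same objects via the transformers Trainer.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("The quarterly numbers beat expectations."))
```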
Data Engineer
Data engineer job in Dallas, TX
Junior Data Engineer
DESCRIPTION: BeaconFire is based in Central NJ, specializing in Software Development, Web Development, and Business Intelligence; we are looking for candidates who are good communicators and self-motivated. You will play a key role in building, maintaining, and operating integrations, reporting pipelines, and data transformation systems.
Qualifications:
Passion for data and a deep desire to learn.
Master's Degree in Computer Science/Information Technology, Data Analytics/Data Science, or related discipline.
Intermediate Python; experience with data-processing libraries (NumPy, Pandas, etc.) is a plus.
Experience with relational databases (SQL Server, Oracle, MySQL, etc.)
Strong written and verbal communication skills.
Ability to work both independently and as part of a team.
Responsibilities:
Collaborate with the analytics team to find reliable data solutions to meet the business needs.
Design and implement scalable ETL or ELT processes to support the business demand for data (a minimal sketch follows this list).
Perform data extraction, manipulation, and production from database tables.
Build utilities, user-defined functions, and frameworks to better enable data flow patterns.
Build and incorporate automated unit tests, participate in integration testing efforts.
Work with teams to resolve operational & performance issues.
Work with architecture/engineering leads and other teams to ensure quality solutions are implemented, and engineering best practices are defined and adhered to.
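To make the ETL responsibilities concrete, a minimal pandas + SQLAlchemy sketch; the connection string and schema are placeholders:

```
# Minimal ETL sketch: extract from a relational table, transform with pandas,
# and load a reporting extract back. The DSN and schema are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://user:pass@host/dbname")  # hypothetical DSN

orders = pd.read_sql("SELECT order_id, customer_id, amount, status FROM orders", engine)

shipped = orders[orders["status"] == "SHIPPED"]
by_customer = shipped.groupby("customer_id", as_index=False)["amount"].sum()

by_customer.to_sql("customer_shipped_totals", engine, if_exists="replace", index=False)
```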
Compensation: $65,000 to $80,000/year
BeaconFire is an e-verified company. Work visa sponsorship is available.
Software Engineer, Platform - Corpus Christi, USA
Data engineer job in Corpus Christi, TX
The mission of Speechify is to make sure that reading is never a barrier to learning.
Over 50 million people use Speechify's text-to-speech products to turn whatever they're reading - PDFs, books, Google Docs, news articles, websites - into audio, so they can read faster, read more, and remember more. Speechify's text-to-speech reading products include its iOS app, Android App, Mac App, Chrome Extension, and Web App. Google recently named Speechify the Chrome Extension of the Year and Apple named Speechify its 2025 Design Award winner for Inclusivity.
Today, nearly 200 people around the globe work on Speechify in a 100% distributed setting - Speechify has no office. These include frontend and backend engineers, AI research scientists, and others from Amazon, Microsoft, and Google, leading PhD programs like Stanford, high growth startups like Stripe, Vercel, Bolt, and many founders of their own companies.
Overview
The responsibilities of our Platform team include building and maintaining all backend services, including, but not limited to, payments, analytics, subscriptions, new products, text to speech, and external APIs.
This is a key role and ideal for someone who thinks strategically, enjoys fast-paced environments, is passionate about making product decisions, and has experience building great user experiences that delight users.
We are a flat organization that allows anyone to become a leader by showing excellent technical skills and delivering results consistently and fast. Work ethic, solid communication skills, and obsession with winning are paramount.
Our interview process involves several technical interviews and we aim to complete them within 1 week.
What You'll Do
Design, develop, and maintain robust APIs including public TTS API, internal APIs like Payment, Subscription, Auth and Consumption Tracking, ensuring they meet business and scalability requirements
Oversee the full backend API landscape, enhancing and optimizing for performance and maintainability
Collaborate on B2B solutions, focusing on customization and integration needs for enterprise clients
Work closely with cross-functional teams to align backend architecture with overall product strategy and user experience
An Ideal Candidate Should Have
Proven experience in backend development: TS/Node (required)
Direct experience with GCP and knowledge of AWS, Azure, or other cloud providers
Efficiency in ideation and implementation, prioritizing tasks based on urgency and impact
Preferred: Experience with Docker and containerized deployments
Preferred: Proficiency in deploying high availability applications on Kubernetes
What We Offer
A dynamic environment where your contributions shape the company and its products
A team that values innovation, intuition, and drive
Autonomy, fostering focus and creativity
The opportunity to have a significant impact in a revolutionary industry
Competitive compensation, a welcoming atmosphere, and a commitment to an exceptional asynchronous work culture
The privilege of working on a product that changes lives, particularly for those with learning differences like dyslexia, ADD, and more
An active role at the intersection of artificial intelligence and audio - a rapidly evolving tech domain
The United States-based salary range for this role is 140,000-200,000 USD/year + bonus + stock, depending on experience.
Think you're a good fit for this job?
Tell us more about yourself and why you're interested in the role when you apply.
And don't forget to include links to your portfolio and LinkedIn.
Not looking but know someone who would make a great fit?
Refer them!
Speechify is committed to a diverse and inclusive workplace.
Speechify does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.