Data Scientist jobs at U.s.government

- 1319 jobs

Data Engineer
Us Government Other Agencies and Independent Organizations 4.2
Data scientist job at U.s.government
Central Intelligence Agency Print Share * * * * Save * This job is open to * Requirements * How you will be evaluated * Required documents * How to apply DDI Data Engineers solve critical data challenges by building optimal data pipelines, analyzing and transforming data, and delivering data to systems to advance CIA's intelligence mission. Summary DDI Data Engineers solve critical data challenges by building optimal data pipelines, analyzing and transforming data, and delivering data to systems to advance CIA's intelligence mission. Overview Help Accepting applications Open & closing dates 10/01/2025 to 09/30/2026 Salary $74,584 to - $156,755 per year Pay scale & grade GS 9 - 13 Location Many vacancies in the following location: Washington, DC Remote job No Telework eligible No Travel Required Occasional travel - You may be expected to travel for this position. Relocation expenses reimbursed Yes-You may qualify for reimbursement of relocation expenses in accordance with agency policy. Appointment type Permanent Work schedule Full-time Service Excepted Promotion potential None Job family (Series) * 2210 Information Technology Management Supervisory status No Security clearance Sensitive Compartmented Information Drug test Yes Position sensitivity and risk Special-Sensitive (SS)/High Risk Trust determination process * National security Financial disclosure Yes Bargaining unit status No Announcement number 24-12198900-6898/DIHD Control number 759326100 This job is open to Help The public U.S. Citizens, Nationals or those who owe allegiance to the U.S. Federal employees - Competitive service Current federal employees whose agencies follow the U.S. Office of Personnel Management's hiring rules and pay scales. Federal employees - Excepted service Current federal employees whose agencies have their own hiring rules, pay scales and evaluation criteria. Videos Duties Help As a DDI Data Engineer at CIA, you will transform and deliver high-priority datasets to enable exploitation and analysis for CIA's intelligence mission. Using computer programming and other capabilities, DDI Data Engineers develop processes to transform raw structured or unstructured data into useful, actionable information for critical mission systems. They design and build scalable data pipelines using commercial cloud, open source, and custom software solutions to process large datasets. DDI Data Engineers create automated data validation methods to load data into systems accurately and consistently. Through partnerships with collectors, data scientists, and software engineers, DDI Data Engineers design how data will be integrated and stored for optimal effectiveness. DDI Data Engineers are curious and use their skills to identify problems and build solutions, applying updated expertise in areas such as ETL, workflow and container orchestration, database and search technologies, distributed computing, serverless design, and cloud solutions. DDI encourages and supports continual learning. Opportunities exist for continued professional development, to include advanced technical training. Travel opportunities may be available. Requirements Help Conditions of employment * You must be physically in the United States or one of its territories when you submit your resume via MyLINK. * You must be registered for the Selective Service, if applicable. * You must be a U.S. citizen and at least 18 years of age (dual-national US citizens are eligible). * You must be willing to move to the Washington, DC area. * You must successfully complete a thorough medical and psychological exam, a polygraph interview, and a comprehensive background investigation. * For further requirements information, please visit: ***************************************************** Qualifications Minimum Qualifications Interested candidates should be passionate about the ideals of our American republic, committed to upholding the rule of law and the U.S. Constitution, and committed to improving the efficiency of the Federal government. Hiring decisions will not be based on race, sex, color, religion, or national origin. * Proficiency with computer programming languages like Python, Java, Scala, or equivalent Object-Oriented Programming Language and knowledge of SQL * Knowledge of data ingestion, transformation, modeling, and storage: * Strong critical thinking and problem-solving skills * Ability to work independently and in groups within a team environment. * Curiosity, creativity, and initiative * Ability to meet the minimum requirements for joining CIA, including U.S. citizenship and a background investigation Desired Qualifications * Prior experience related to data engineering, data science, data architecture, database and search technologies * Experience analyzing or processing bulk datasets * Experience with signal/image processing, geospatial modeling or voice/data communications * Prior application of serverless design patterns, such as serverless applications or websites * Understanding of scaling and performance of distributed/cloud systems * Understanding of data science concepts, AI/ML, automation, and scripting Education * Bachelor's degree in one of the following fields or related studies: * Computer Science * Computer/Electrical Engineering or related field * Data Science or related quantitative field * Information Science/Technology * Forensics (Digital Forensics, Forensic Science, and Technology) * Mathematics/Statistics * At least a 3.0 GPA on a 4-point scale Additional information Candidates should be committed to improving the efficiency of the Federal government, passionate about the ideals of our American republic, and committed to upholding the rule of law and the United States Constitution. Benefits Help A career with the U.S. government provides employees with a comprehensive benefits package. As a federal employee, you and your family will have access to a range of benefits that are designed to make your federal career very rewarding. Opens in a new window Learn more about federal benefits. Review our benefits Eligibility for benefits depends on the type of position you hold and whether your position is full-time, part-time or intermittent. Contact the hiring agency for more information on the specific benefits offered. How you will be evaluated You will be evaluated for this job based on how well you meet the qualifications above. For more information, please visit: ***************************** Benefits Help A career with the U.S. government provides employees with a comprehensive benefits package. As a federal employee, you and your family will have access to a range of benefits that are designed to make your federal career very rewarding. Opens in a new window Learn more about federal benefits. Review our benefits Eligibility for benefits depends on the type of position you hold and whether your position is full-time, part-time or intermittent. Contact the hiring agency for more information on the specific benefits offered. Required documents Required Documents Help For further information, please visit: ***************************** If you are relying on your education to meet qualification requirements: Education must be accredited by an accrediting institution recognized by the U.S. Department of Education in order for it to be credited towards qualifications. Therefore, provide only the attendance and/or degrees from schools accredited by accrediting institutions recognized by the U.S. Department of Education. Failure to provide all of the required information as stated in this vacancy announcement may result in an ineligible rating or may affect the overall rating. How to Apply Help This post is for viewing purposes only. To get started, please visit ***************************** where you can read more about this position and express your interest in up to four jobs. Upon expressing your interest, you will be taken to MyLINK, which allows you to submit your resume and job rankings and provide basic information about yourself. Agency contact information Central Intelligence Agency Website *********************** Next steps After you express interest for up to four positions on cia.gov/careers, a CIA recruiter may contact you for further discussion if your qualifications meet our needs. See the MyLINK FAQs on our website for more information. Fair and transparent The Federal hiring process is set up to be fair and transparent. Please read the following guidance. Criminal history inquiries Equal Employment Opportunity (EEO) Policy Financial suitability New employee probationary period Privacy Act Reasonable accommodation policy Selective Service Signature and false statements Social security number request Required Documents Help For further information, please visit: ***************************** If you are relying on your education to meet qualification requirements: Education must be accredited by an accrediting institution recognized by the U.S. Department of Education in order for it to be credited towards qualifications. Therefore, provide only the attendance and/or degrees from schools accredited by accrediting institutions recognized by the U.S. Department of Education. Failure to provide all of the required information as stated in this vacancy announcement may result in an ineligible rating or may affect the overall rating.
$74.6k-156.8k yearly 25d ago
Data Scientist
Kavaliro 4.2
McLean, VA jobs
Kavaliro is seeking a Data Scientist to provide highly technical and in-depth data engineering support. The candidate MUST have experience designing and building data infrastructure, developing data pipelines, transforming and preparing data, ensuring data quality and security, and monitoring and optimizing systems. The candidate MUST have extensive experience with Python and AWS. Experience with SQL, multi-data source queries with database technologies (PostgreSQL, MySQL, RDS, etc.), NiFi, Git, Elasticsearch, Kibana, Jupyter Notebooks, NLP, AI, and any data visualization tools (Tableau, Kibana, Qlik, etc.) are desired. Required Skills and Demonstrated Experience Demonstrated experience with data engineering, to include designing and building data infrastructure, developing data pipelines, transforming/preparing data, ensuring data quality and security, and monitoring/optimizing systems. Demonstrated experience with data management and integration, including designing and perating robust data layers for application development across local and cloud or web data sources. Demonstrated work experience programming with Python Demonstrated experience building scalable ETL and ELT workflows for reporting and analytics. Demonstrated experience with general Linux computing and advanced bash scripting Demonstrated experience with SQL. Demonstrated experience constructing complex multi-data source queries with database technologies such as PostgreSQL, MySQL, Neo4J or RDS Demonstrated experience processing data sources containing structured or unstructured data Demonstrated experience developing data pipelines with NiFi to bring data into a central environment Demonstrated experience delivering results to stakeholders through written documentation and oral briefings Demonstrated experience using code repositories such as Git Demonstrated experience using Elastic and Kibana Demonstrated experience working with multiple stakeholders Demonstrated experience documenting such artifacts as code, Python packages and methodologies Demonstrated experience using Jupyter Notebooks Demonstrated experience with machine learning techniques including natural language processing Demonstrated experience explaining complex technical issues to more junior data scientists, in graphical, verbal, or written formats Demonstrated experience developing tested, reusable and reproducible work Work or educational background in one or more of the following areas: mathematics, statistics, hard sciences (e.g. Physics, Computational Biology, Astronomy, Neuroscience, etc.) computer science, data science, or business analytics Desired Skills and Demonstrated Experience Demonstrated experience with cloud services, such as AWS, as well as cloud data technologies and architecture. Demonstrated experience using big data processing tools such as Apache Spark or Trino Demonstrated experience with machine learning algorithms Demonstrated experience with using container frameworks such as Docker or Kubernetes Demonstrated experience with using data visualizations tools such as Tableau, Kibana or Apache Superset Demonstrated experience creating learning objectives and creating teaching curriculum in technical or scientific fields Location: McLean, Virginia This position is onsite and there is no remote availability. Clearance: TS/SCI with Full Scope Polygraph Applicant MUST hold a permanent U.S. citizenship for this position in accordance with government contract requirements. Kavaliro provides Equal Employment Opportunities to all employees and applicants. All qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Kavaliro is committed to the full inclusion of all qualified individuals. In keeping with our commitment, Kavaliro will take the steps to assure that people with disabilities are provided reasonable accommodations. Accordingly, if reasonable accommodation is required to fully participate in the job application or interview process, to perform the essential functions of the position, and/or to receive all other benefits and privileges of employment, please respond to this posting to connect with a company representative.
$74k-105k yearly est. 5d ago
Data Scientist with ML
Kavaliro 4.2
Reston, VA jobs
Kavaliro is seeking a Data Scientist to provide highly technical and in-depth data engineering support. MUST have experience with Python, PyTorch, Flask (knowledge at minimum with ability to quickly pickup), Familiarity with REST APIs (at minimum), Statistics background/experience, Basic understanding of NLP. Desired skills for a candidate include experience performance R&D with natural language processing, deploying CNN and LLMs or foundational models, deploying ML models on multimedia data, experience with Linux System Administration (or bash), experience with Android Configuration, experience in embedded systems (Raspberry Pi). Required Skills and Demonstrated Experience Demonstrated experience in Python, Javascript, and R. Demonstrated experience employing machine learning and deep learning modules such as Pandas, Scikit, Tensorflow, Pytorch. Demonstrated experience with statistical inference, as well as building and understanding predictive models, using machine learning methods. Demonstrated experience with large-scale text analytics. Desired Skills Demonstrated hands-on experience performing research or development with natural language processing and working with, deploying, and testing Convolutional Neural Networks (CNN), large-language models (LLMs) or foundational models. Demonstrated experience developing and deploying testing and verification methodologies to evaluate algorithm performance and identify strategies for improvement or optimization. Demonstrated experience deploying machine learning models on multimedia data, to include joint text, audio, video, hardware, and peripherals. Demonstrated experience with Linux System Administration and associated scripting languages (Bash) Demonstrated experience with Android configuration, software development, and interfacing. Demonstrated experience in embedded systems (Raspberry Pi) Develops and conducts independent testing and evaluation methods on research-grade algorithms in applicable fields. Reports results and provide documentation and guidance on working with the research-grade algorithms. Evaluates, Integrates and leverage internally-hosted data science tools. Customize research grade algorithms to be optimized for memory and computational efficiency through quantizing, trimming layers, or through custom methods Location: Reston, Virginia This position is onsite and there is no remote availability. Clearance: Active TS/SCI with Full Scope Polygraph Applicant MUST hold a permanent U.S. citizenship for this position in accordance with government contract requirements. Kavaliro provides Equal Employment Opportunities to all employees and applicants. All qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Kavaliro is committed to the full inclusion of all qualified individuals. In keeping with our commitment, Kavaliro will take the steps to assure that people with disabilities are provided reasonable accommodations. Accordingly, if reasonable accommodation is required to fully participate in the job application or interview process, to perform the essential functions of the position, and/or to receive all other benefits and privileges of employment, please respond to this posting to connect with a company representative.
$74k-105k yearly est. 2d ago
Data Scientist
The Intersect Group 4.2
Phoenix, AZ jobs
We are seeking a Data Scientist to support advanced analytics and machine learning initiatives across the organization. This role involves working with large, complex datasets to uncover insights, validate data integrity, and build predictive models. A key focus will be developing and refining machine learning models that leverage sales and operational data to optimize pricing strategies at the store level. Day-to-Day Responsibilities Compare and validate numbers across multiple data systems Investigate discrepancies and understand how metrics are derived Perform data science and data analysis tasks Build and maintain AI/ML models using Python Interpret model results, fine-tune algorithms, and iterate based on findings Validate and reconcile data from different sources to ensure accuracy Work with sales and production data to produce item-level pricing recommendations Support ongoing development of a new data warehouse and create queries as needed Review Power BI dashboards (Power BI expertise not required) Contribute to both ML-focused work and general data science responsibilities Improve and refine an existing ML pricing model already in production Qualifications Strong proficiency with MS SQL Server Experience creating and deploying machine learning models in Python Ability to interpret, evaluate, and fine-tune model outputs Experience validating and reconciling data across systems Strong foundation in machine learning, data modeling, and backend data operations Familiarity with querying and working with evolving data environments
$76k-109k yearly est. 2d ago
ETL/ELT Data Engineer (Secret Clearance) - Hybrid
Launchcode 2.9
Austin, TX jobs
LaunchCode is recruiting for a Software Data Engineer to work at one of our partner companies! Details: Full-Time W2, Salary Immediate opening Hybrid - Austin, TX (onsite 1-2 times a week) Pay $85K-$120K Minimum Experience: 4 years Security Clearance: Active DoD Secret Clearance Disclaimer: Please note that we are unable to provide work authorization or sponsorship for this role, now or in the future. Candidates requiring current or future sponsorship will not be considered. Job description Job Summary A Washington, DC-based software solutions provider founded in 2017, specializes in delivering mission-critical and enterprise solutions to the federal government. Originating from the Department of Defense's software factory ecosystem, the company focuses on Command and Control, Cybersecurity, Space, Geospatial, and Modeling & Simulation. The company leverages commercial technology to enhance the capabilities of the DoD, IC, and their end-users, with innovation driven by its Innovation centers. The company has a presence in Boston, MA, Colorado Springs, CO, San Antonio, TX, and St. Louis, MO. Why the company? Environment of Autonomy Innovative Commercial Approach People over process We are seeking a passionate Software Data Engineer to support the Army Software Factory (ASWF) in aligning with DoDM 8140.03 Cyber Workforce requirements and broader compliance mandates. The Army Software Factory (ASWF), a first-of-its-kind initiative under Army Futures Command, is revolutionizing the Army's approach to software development by training and employing self-sustaining technical talent from across the military and civilian workforce. Guided by the motto “By Soldiers, For Soldiers,” ASWF equips service members to develop mission-critical software solutions independently-especially vital for future contested environments where traditional technical support may be unavailable. This initiative also serves as a strategic prototype to modernize legacy IT processes and build technical readiness across the force to ensure battlefield dominance in the digital age. Required Skills: Active DoD Secret Clearance (Required) 4+ years of experience in data science, data engineering, or similar roles. Expertise in designing, building, and maintaining scalable ETL/ELT pipelines using tools and languages such as Python, SQL, Apache Spark, or Airflow. Strong proficiency in working with relational and NoSQL databases, including experience with database design, optimization, and query performance tuning (e.g., PostgreSQL, MySQL, MongoDB, Cassandra). Demonstrable experience with cloud data platforms and services (e.g., AWS Redshift, S3, Glue, Athena; Azure Data Lake, Data Factory, Synapse; Google BigQuery, Cloud Storage, Dataflow). Solid understanding of data warehousing concepts (e.g., Kimball, Inmon methodologies) and experience with data modeling for analytical purposes. Proficiency in at least one programming language commonly used in data engineering (e.g., Python, Java, Scala) for data manipulation, scripting, and automation. CompTIA Security+ Certified or otherwise DoDM 8140.03 (formerly DoD 8570.01-M) compliant. Nice to Have: Familiarity with SBIR technologies and transformative platform shifts Experience working in Agile or DevSecOps environments 2+ years of experience interfacing with Platform Engineers and data visibility team, manage AWS resources, and GitLab admin #LI-hybrid #austintx #ETLengineer #dataengineer #army #aswf #clearancejobs #clearedjobs #secretclearance #ETL
$85k-120k yearly 5d ago
Head of Data Science & AI
Addison Group 4.6
Austin, TX jobs
Duration: 6 month contract-to-hire Compensation: $150K-160K Work schedule: Monday-Friday (8 AM-5PM CST) - onsite 3x per week Benefits: This position is eligible for medical, dental, vision and 401(k) The Head of Data Science & AI leads the organization's data science strategy and team, driving advanced analytics and AI initiatives to deliver business value and innovation. This role sets the strategic direction for data science, ensures alignment with organizational goals, and promotes a data-driven culture. It involves close collaboration with business and technology teams to identify opportunities for leveraging machine learning and AI to improve operations and customer experiences. Key Responsibilities Develop and execute a data science strategy and roadmap aligned with business objectives. Build and lead the data science team, providing mentorship and fostering growth. Partner with business leaders to identify challenges and deliver actionable insights. Oversee design and deployment of predictive models, algorithms, and analytical frameworks. Ensure data integrity, governance, and security in collaboration with engineering teams. Communicate complex insights to non-technical stakeholders. Manage infrastructure, tools, and budget for data science initiatives. Drive experimentation with emerging AI technologies and ensure ethical AI practices. Oversee full AI model lifecycle: development, deployment, monitoring, and compliance. Qualifications 8+ years in data science/analytics with leadership experience. Expertise in Python, R, SQL, and ML frameworks (TensorFlow, PyTorch, Scikit-Learn). Experience deploying ML models and monitoring performance. Familiarity with visualization tools (Tableau, Power BI). Strong knowledge of data governance, advanced statistical methods, and AI trends. Skills in project management tools (MS Project, JIRA) and software development best practices (CI/CD, Git, Agile). Please apply directly to be considered.
$150k-160k yearly 3d ago
Data Architect - Azure Databricks
Fractal 4.2
Palo Alto, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Job Posting Title: Principal Architect - Azure Databricks Job Description Seeking a visionary and hands-on Principal Architect to lead large-scale, complex technical initiatives leveraging Databricks within the healthcare payer domain. This role is pivotal in driving data modernization, advanced analytics, and AI/ML solutions for our clients. You will serve as a strategic advisor, technical leader, and delivery expert across multiple engagements. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs such as sales forecasting, trade promotions, supply chain optimization etc... Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using PySpark, SQL, DLT (Delta Live Tables), and Databricks Workflows. Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for: Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for: Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 12-18 years of hands-on experience in data engineering, with at least 5+ years on Databricks Architecture and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on Azure Databricks using PySpark, SQL, and Databricks-native features. Familiarity with ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage (Azure Data Lake Storage Gen2) Experience designing Lakehouse architectures with bronze, silver, gold layering. Expertise in optimizing Databricks performance using Delta Lake features such as OPTIMIZE, VACUUM, ZORDER, and Time Travel Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Databricks SQL warehouse and integrating with BI tools (Power BI, Tableau, etc.). Hands-on experience designing solutions using Workflows (Jobs), Delta Lake, Delta Live Tables (DLT), Unity Catalog, and MLflow. Familiarity with Databricks REST APIs, Notebooks, and cluster configurations for automated provisioning and orchestration. Experience in integrating Databricks with CI/CD pipelines using tools such as Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning Databricks workspaces and resources In-depth experience with Azure Cloud services such as ADF, Synapse, ADLS, Key Vault, Azure Monitor, and Azure Security Centre. Strong understanding of data privacy, access controls, and governance best practices. Experience working with Unity Catalog, RBAC, tokenization, and data classification frameworks Worked as a consultant for more than 4-5 years with multiple clients Contribute to pre-sales, proposals, and client presentations as a subject matter expert. Participated and Lead RFP responses for your organization. Experience in providing solutions for technical problems and provide cost estimates Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $ 200,000 - $300,000. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$200k-300k yearly 1d ago
AWS Data Architect
Fractal 4.2
San Jose, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
AWS Data Architect
Fractal 4.2
Santa Rosa, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
AWS Data Architect
Fractal 4.2
San Francisco, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
AWS Data Architect
Fractal 4.2
Sunnyvale, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
AWS Data Architect
Fractal 4.2
Santa Clara, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
AWS Data Architect
Fractal 4.2
Fremont, CA jobs
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite is the one who empowers imagination with intelligence. Fractal has been featured as a Great Place to Work by The Economic Times in partnership with the Great Place to Work Institute and recognized as a ‘Cool Vendor' and a ‘Vendor to Watch' by Gartner. Please visit Fractal | Intelligence for Imagination for more information about Fractal. Fractal is looking for a proactive and driven AWS Lead Data Architect/Engineer to join our cloud and data tech team. In this role, you will work on designing the system architecture and solution, ensuring the platform is scalable while performant, and creating automated data pipelines. Responsibilities: Design & Architecture of Scalable Data Platforms Design, develop, and maintain large-scale data processing architectures on the Databricks Lakehouse Platform to support business needs Architect multi-layer data models including Bronze (raw), Silver (cleansed), and Gold (curated) layers for various domains (e.g., Retail Execution, Digital Commerce, Logistics, Category Management). Leverage Delta Lake, Unity Catalog, and advanced features of Databricks for governed data sharing, versioning, and reproducibility. Client & Business Stakeholder Engagement Partner with business stakeholders to translate functional requirements into scalable technical solutions. Conduct architecture workshops and solutioning sessions with enterprise IT and business teams to define data-driven use cases Data Pipeline Development & Collaboration Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, SQL Enable data ingestion from diverse sources such as ERP (SAP), POS data, Syndicated Data, CRM, e-commerce platforms, and third-party datasets. Performance, Scalability, and Reliability Optimize Spark jobs for performance tuning, cost efficiency, and scalability by configuring appropriate cluster sizing, caching, and query optimization techniques. Implement monitoring and alerting using Databricks Observability, Ganglia, Cloud-native tools Security, Compliance & Governance Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies. Establish data governance practices including Data Fitness Index, Quality Scores, SLA Monitoring, and Metadata Cataloging. Adoption of AI Copilots & Agentic Development Utilize GitHub Copilot, Databricks Assistant, and other AI code agents for Writing PySpark, SQL, and Python code snippets for data engineering and ML tasks. Generating documentation and test cases to accelerate pipeline development. Interactive debugging and iterative code optimization within notebooks. Advocate for agentic AI workflows that use specialized agents for Data profiling and schema inference. Automated testing and validation. Innovation and Continuous Learning Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling. Evaluate and pilot new features from Databricks releases and partner integrations for modern data stack improvements. Requirements: Bachelor's or master's degree in computer science, Information Technology, or a related field. 8-12 years of hands-on experience in data engineering, with at least 5+ years on Python and Apache Spark. Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS/Azure/GCP using Python, PySpark, SQL. Excellent hands on experience with workload automation tools such as Airflow, Prefect etc. Familiarity with building dynamic ingestion frameworks from structured/unstructured data sources including APIs, flat files, RDBMS, and cloud storage Experience designing Lakehouse architectures with bronze, silver, gold layering. Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing. Experience with designing Data marts using Cloud data warehouses and integrating with BI tools (Power BI, Tableau, etc.). Experience CI/CD pipelines using tools such as AWS Code commit, Azure DevOps, GitHub Actions. Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources In-depth experience with AWS Cloud services such as Glue, S3, Redshift etc. Strong understanding of data privacy, access controls, and governance best practices. Experience working with RBAC, tokenization, and data classification frameworks Excellent communication skills for stakeholder interaction, solution presentations, and team coordination. Proven experience leading or mentoring global, cross-functional teams across multiple time zones and engagements. Ability to work independently in agile or hybrid delivery models, while guiding junior engineers and ensuring solution quality Must be able to work in PST time zone. Pay: The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Fractal, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is: $150k - $180k. In addition, you may be eligible for a discretionary bonus for the current performance period. Benefits: As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation. Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
$150k-180k yearly 3d ago
Data Engineer
Addison Group 4.6
Coppell, TX jobs
Title: Data Engineer Assignment Type: 6-12 month contract-to-hire Compensation: $65/hr-$75/hr W2 Work Model: Hybrid (4 days on-site, 1 day remote) Benefits: Medical, Dental, Vision, 401(k) What we need is someone who comes 8+ years of experience in the Data Engineering space who specializes in Microsoft Azure and Databricks. This person will be a part of multiple initiatives for the "New Development" and "Data Reporting" teams but will be primarily tasked with designing, building, maintaining, and automating their enterprise data architecture/pipelines within the cloud. Technology-wise we are needing to come with skills in Azure Databricks (5+ years), cloud-based environment (Azure and/or AWS), Azure DevOps (ADO), SQL (ETL, SSIS packages), and PySpark or Scala automation. Architecture experience in building pipelines, data modeling, data pipeline deployment, data mapping, etc. Top Skills: -8+ Years of Data Engineer/Business Intelligence -Databricks and Azure Data Factory *Most updated is Unity Catalog for Databricks* -Cloud-based environments (Azure or AWS) -Data Pipeline Architecture and CI/CD methodology -SQL -Automation (Python (PySpark), Scala)
$65-75 hourly 3d ago
Senior Data Engineer
Addison Group 4.6
Houston, TX jobs
About the Role The Senior Data Engineer will play a critical role in building and scaling an enterprise data platform to enable analytics, reporting, and operational insights across the organization. This position requires deep expertise in Snowflake and cloud technologies (AWS or Azure), along with strong upstream oil & gas domain experience. The engineer will design and optimize data pipelines, enforce data governance and quality standards, and collaborate with cross-functional teams to deliver reliable, scalable data solutions. Key Responsibilities Data Architecture & Engineering Design, develop, and maintain scalable data pipelines using Snowflake, AWS/Azure, and modern data engineering tools. Implement ETL/ELT processes integrating data from upstream systems (SCADA, production accounting, drilling, completions, etc.). Architect data models supporting both operational reporting and advanced analytics. Establish and maintain frameworks for data quality, validation, and lineage to ensure enterprise data trust. Platform Development & Optimization Lead the build and optimization of Snowflake-based data warehouses for performance and cost efficiency. Design cloud-native data solutions leveraging AWS/Azure services (S3, Lambda, Azure Data Factory, Databricks). Manage large-scale time-series and operational data processing workflows. Implement strong security, access control, and governance practices. Technical Leadership & Innovation Mentor junior data engineers and provide technical leadership across the data platform team. Research and introduce new technologies to enhance platform scalability and automation. Build reusable frameworks, components, and utilities to streamline delivery. Support AI/ML initiatives by delivering production-ready, high-quality data pipelines. Business Partnership Collaborate with stakeholders across business units to translate requirements into technical solutions. Work with analysts and data scientists to enable self-service analytics and reporting. Ensure data integration supports regulatory and compliance reporting. Act as a bridge between business and technical teams to ensure alignment and impact. Qualifications & Experience Education Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field. Advanced degree or relevant certifications (SnowPro, AWS/Azure Data Engineer, Databricks) preferred. Experience 7+ years in data engineering roles, with at least 3 years on cloud data platforms. Proven expertise in Snowflake and at least one major cloud platform (AWS or Azure). Hands-on experience with upstream oil & gas data (wells, completions, SCADA, production, reserves, etc.). Demonstrated success delivering operational and analytical data pipelines. Technical Skills Advanced SQL and Python programming skills. Strong background in data modeling, ETL/ELT, cataloging, lineage, and data security. Familiarity with Airflow, Azure Data Factory, or similar orchestration tools. Experience with CI/CD, Git, and automated testing. Knowledge of BI tools such as Power BI, Spotfire, or Tableau. Understanding of AI/ML data preparation and integration.
$86k-125k yearly est. 5d ago
Data Engineer
Sharp Decisions 4.6
New York, NY jobs
Hey All, We are looking for a mid-level data engineer. No third parties As a result of this expansion, we are seeking experienced software Data engineers with 5+ years of relevant experience to support the design and development of a strategic data platform for SMBC Capital Markets and Nikko Securities Group. Qualifications and Skills • Proven experience as a Data Engineer with experience in Azure cloud. • Experience implementing solutions using - • Azure cloud services • Azure Data Factory • Azure Lake Gen 2 • Azure Databases • Azure Data Fabric • API Gateway management • Azure Functions • Well versed with Azure Databricks • Strong SQL skills with RDMS or no SQL databases • Experience with developing APIs using FastAPI or similar frameworks in Python • Familiarity with the DevOps lifecycle (git, Jenkins, etc.), CI/CD processes • Good understanding of ETL/ELT processes • Experience in financial services industry, financial instruments, asset classes and market data are a plus.
$85k-111k yearly est. 5d ago
Data Architect
Mindlance 4.6
Washington, DC jobs
Job Title: Developer Premium I Duration: 7 Months with long term extension Hybrid Onsite: 4 days per week from Day 1, with a full transition to 100% onsite anticipated soon Job Requirement: Strong expertise in Data Architecture & Date model design. MS Azure (core experiment) Experience with SAP ECC preferred SAFE agile certification is a plus Ability to work flexibility including off hours to support critical IT task & migration activities. Educational Qualifications and Experience: Bachelor's degree in Computer Science, Information Systems or in a related area of expertise. Required number of years of proven experience in the specific technology/toolset as per Experience Matrix below for each Level. Essential Job Functions: Take functional specs and produce high quality technical specs Take technical specs and produce completed and well tested programs which meet user satisfaction and acceptance, and precisely reflect the requirements - business logic, performance, and usability requirements Conduct/attend requirements definition meetings with end-users and document system/business requirements Conduct Peer Review on Code and Test Cases, prepared by other team members, to assess quality and compliance with coding standards As required for the role, perform end-user demos of proposed solution and finished product, provide end user training and provide support for user acceptance testing As required for the role, troubleshoot production support issues and find appropriate solutions within defined SLA to ensure minimal disruption to business operations Ensure that Bank policies, procedures, and standards are factored into project design and development As required for the role, install new release, and participate in upgrade activities As required for the role, perform integration between systems that are on prem and also on the cloud and third-party vendors As required for the role, collaborate with different teams within the organization for infrastructure, integration, database administration support Adhere to project schedules and report progress regularly Prepare weekly status reports and participate in status meetings and highlight issues and constraints that would impact timely delivery of work program items Find the appropriate tools to implement the project Maintain knowledge of current industry standards and practices As needed, interact and collaborate with Enterprise Architects (EA), Office of Information Security (OIS) to obtain approvals and accreditations “Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of - Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans.”
$93k-124k yearly est. 5d ago
Data Architect
KPI Partners 4.8
Plano, TX jobs
KPI Partners is a 5 times Gartner-recognized data, analytics, and AI consulting company. We are leaders in data engineering on Azure, AWS, Google, Snowflake, and Databricks. Founded in 2006, KPI has over 400 consultants and has successfully delivered over 1,000 projects to our clients. We are looking for skilled data engineers who want to work with the best team in data engineering. Title: Senior Data Architect Location: Plano, TX (Hybrid) Job Type: Contract - 6 Months Key Skills: SQL, PySpark, Databricks, and Azure Cloud Key Note: Looking for a Data Architect who is Hands-on with SQL, PySpark, Databricks, and Azure Cloud. About the Role: We are seeking a highly skilled and experienced Senior Data Architect to join our dynamic team at KPI, working on challenging and multi-year data transformation projects. This is an excellent opportunity for a talented data engineer to play a key role in building innovative data solutions using Azure Native Services and related technologies. If you are passionate about working with large-scale data systems and enjoy solving complex engineering problems, this role is for you. Key Responsibilities: Data Engineering: Design, development, and implementation of data pipelines and solutions using PySpark, SQL, and related technologies. Collaboration: Work closely with cross-functional teams to understand business requirements and translate them into robust data solutions. Data Warehousing: Design and implement data warehousing solutions, ensuring scalability, performance, and reliability. Continuous Learning: Stay up to date with modern technologies and trends in data engineering and apply them to improve our data platform. Mentorship: Provide guidance and mentorship to junior data engineers, ensuring best practices in coding, design, and development. Must-Have Skills & Qualifications: Minimum 12+ years of overall experience in IT Industry. 4+ years of experience in data engineering, with a strong background in building large-scale data solutions. 4+ years of hands-on experience developing and implementing data pipelines using Azure stack experience (Azure, ADF, Databricks, Functions) Proven expertise in SQL for querying, manipulating, and analyzing large datasets. Strong knowledge of ETL processes and data warehousing fundamentals. Self-motivated and independent, with a “let's get this done” mindset and the ability to thrive in a fast-paced and dynamic environment. Good-to-Have Skills: Databricks Certification is a plus. Data Modeling, Azure Architect Certification.
$88k-123k yearly est. 2d ago
Senior Data Engineer
Luna Data Solutions, Inc. 4.4
Austin, TX jobs
We are looking for a seasoned Azure Data Engineer to design, build, and optimize secure, scalable, and high-performance data solutions within the Microsoft Azure ecosystem. This will be a multi-year contract worked FULLY ONSITE in Austin, TX. The ideal candidate brings deep technical expertise in data architecture, ETL/ELT engineering, data integration, and governance, along with hands-on experience in MDM, API Management, Lakehouse architectures, and data mesh or data hub frameworks. This position combines strategic architectural planning with practical, hands-on implementation, empowering cross-functional teams to leverage data as a key organizational asset. Key Responsibilities 1. Data Architecture & Strategy Design and deploy end-to-end Azure data platforms using Azure Data Lake, Azure Synapse Analytics, Azure Databricks, and Azure SQL Database. Build and implement Lakehouse and medallion (Bronze/Silver/Gold) architectures for scalable and modular data processing. Define and support data mesh and data hub patterns to promote domain-driven design and federated governance. Establish standards for conceptual, logical, and physical data modeling across data warehouse and data lake environments. 2. Data Integration & Pipeline Development Develop and maintain ETL/ELT pipelines using Azure Data Factory, Synapse Pipelines, and Databricks for both batch and streaming workloads. Integrate diverse data sources (on-prem, cloud, SaaS, APIs) into a unified Azure data environment. Optimize pipelines for cost-effectiveness, performance, and scalability. 3. Master Data Management (MDM) & Data Governance Implement MDM solutions using Azure-native or third-party platforms (e.g., Profisee, Informatica, Semarchy). Define and manage data governance, metadata, and data quality frameworks. Partner with business teams to align data standards and maintain data integrity across domains. 4. API Management & Integration Build and manage APIs for data access, transformation, and system integration using Azure API Management and Logic Apps. Design secure, reliable data services for internal and external consumers. Automate workflows and system integrations using Azure Functions, Logic Apps, and Power Automate. 5. Database & Platform Administration Perform core DBA tasks, including performance tuning, query optimization, indexing, and backup/recovery for Azure SQL and Synapse. Monitor and optimize cost, performance, and scalability across Azure data services. Implement CI/CD and Infrastructure-as-Code (IaC) solutions using Azure DevOps, Terraform, or Bicep. 6. Collaboration & Leadership Work closely with data scientists, analysts, business stakeholders, and application teams to deliver high-value data solutions. Mentor junior engineers and define best practices for coding, data modeling, and solution design. Contribute to enterprise-wide data strategy and roadmap development. Required Qualifications Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or related fields. 5+ years of hands-on experience in Azure-based data engineering and architecture. Strong proficiency with the following: Azure Data Factory, Azure Synapse, Azure Databricks, Azure Data Lake Storage Gen2 SQL, Python, PySpark, PowerShell Azure API Management and Logic Apps Solid understanding of data modeling approaches (3NF, dimensional modeling, Data Vault, star/snowflake schemas). Proven experience with Lakehouse/medallion architectures and data mesh/data hub designs. Familiarity with MDM concepts, data governance frameworks, and metadata management. Experience with automation, data-focused CI/CD, and IaC. Thorough understanding of Azure security, RBAC, Key Vault, and core networking principles. What We Offer Competitive compensation and benefits package Luna Data Solutions, Inc. (LDS) provides equal employment opportunities to all employees. All applicants will be considered for employment. LDS prohibits discrimination and harassment of any type regarding age, race, color, religion, sexual orientation, gender identity, sex, national origin, genetics, protected veteran status, and/or disability status.
$74k-95k yearly est. 2d ago
Data Engineer
Interactive Resources-IR 4.2
Austin, TX jobs
About the Role We are seeking a highly skilled Databricks Data Engineer with strong expertise in modern data engineering, Azure cloud technologies, and Lakehouse architectures. This role is ideal for someone who thrives in dynamic environments, enjoys solving complex data challenges, and can lead end-to-end delivery of scalable data solutions. What We're Looking For 8+ years designing and delivering scalable data pipelines in modern data platforms Deep experience in data engineering, data warehousing, and enterprise-grade solution delivery Ability to lead cross-functional initiatives in matrixed teams Advanced skills in SQL, Python, and ETL/ELT development, including performance tuning Hands-on experience with Azure, Snowflake, and Databricks, including system integrations Key Responsibilities Design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform Modernize and enhance cloud-based data ecosystems on Azure, contributing to architecture, modeling, security, and CI/CD Use Apache Airflow and similar tools for workflow automation and orchestration Work with financial or regulated datasets while ensuring strong compliance and governance Drive best practices in data quality, lineage, cataloging, and metadata management Primary Technical Skills Develop and optimize ETL/ELT pipelines using Python, PySpark, Spark SQL, and Databricks Notebooks Design efficient Delta Lake models for reliability and performance Implement and manage Unity Catalog for governance, RBAC, lineage, and secure data sharing Build reusable frameworks using Databricks Workflows, Repos, and Delta Live Tables Create scalable ingestion pipelines for APIs, databases, files, streaming sources, and MDM systems Automate ingestion and workflows using Python and REST APIs Support downstream analytics for BI, data science, and application workloads Write optimized SQL/T-SQL queries, stored procedures, and curated datasets Automate DevOps workflows, testing pipelines, and workspace configurations Additional Skills Azure: Data Factory, Data Lake, Key Vault, Logic Apps, Functions CI/CD: Azure DevOps Orchestration: Apache Airflow (plus) Streaming: Delta Live Tables MDM: Profisee (nice-to-have) Databases: SQL Server, Cosmos DB Soft Skills Strong analytical and problem-solving mindset Excellent communication and cross-team collaboration Detail-oriented with a high sense of ownership and accountability
$84k-111k yearly est. 2d ago