Principal Biostatistician
Data scientist job in King of Prussia, PA
CSL's R&D organization is accelerating innovation to deliver greater impact for patients. With a project-led structure and a focus on collaboration, we're building a future-ready team that thrives in dynamic biotech ecosystems. Joining CSL now means being part of an agile team committed to developing therapies that make a meaningful difference worldwide.
Could you be our next Principal Biostatistician? The job is based in our King of Prussia, PA; Waltham, MA; and Maidenhead, UK offices. This is a hybrid position, onsite three days a week. You will report to the Director of Biostatistics.
You will lead components of the statistical contribution to a clinical development program. The Principal Biostatistician implements the statistical strategies for the clinical trials and regulatory submissions within the program and is accountable for the statistical deliverables.
Main Responsibilities:
Provide input to statistical strategy and ensure appropriate statistical methodologies are applied to study design and data analysis for clinical trials and regulatory submissions.
Lead components of, and fully support, Biostatistics activities in study design, protocol development, data collection, data analysis, reporting, and submission preparation.
Author the initial statistical analysis plan for clinical trials and regulatory submissions. Be accountable for timely completion and quality of the statistical analysis plan.
Support Biostatistics interactions with regulatory authorities (e.g., FDA, EMA, PMDA).
Be responsible for interpreting analysis results and ensuring reporting accuracy.
Manage outsourcing operations or work with internal statistical programmers within the responsible projects. Ensure the timeliness and quality of CRO/FSP deliverables, conducting reviews of those deliverables.
Be accountable for the TFL/CDISC package for study reports and regulatory submissions.
Provide statistical thought partnership for innovative study designs and clinical development plans, including Go/No-Go criteria and probability of technical success calculations (a small illustrative sketch follows).
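As a hedged illustration of the probability of technical success calculations mentioned above (not part of the posting itself): one common approach is assurance, i.e., trial power averaged over a prior on the true treatment effect. Every number in this sketch (prior, sample size, significance level) is a hypothetical placeholder.

```python
# Hypothetical sketch: probability of technical success (assurance) by simulation.
# All inputs (prior, sample size, alpha) are illustrative placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_per_arm = 100                    # planned subjects per arm (hypothetical)
sigma = 1.0                        # assumed known outcome SD
alpha = 0.025                      # one-sided significance level
prior_mean, prior_sd = 0.3, 0.15   # prior belief about the true effect

n_sims = 100_000
true_effects = rng.normal(prior_mean, prior_sd, n_sims)

# Power of a two-sample z-test, conditional on each sampled true effect
se = sigma * np.sqrt(2 / n_per_arm)
z_crit = stats.norm.ppf(1 - alpha)
power_given_effect = stats.norm.sf(z_crit - true_effects / se)

# Assurance = expected power under the prior
assurance = power_given_effect.mean()
print(f"Probability of technical success (assurance): {assurance:.1%}")
```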
Qualifications and Experience Requirements:
PhD or MS in Biostatistics or Statistics
7+ years of relevant work experience
Experience with CROs (either managing a CRO, or having worked in a CRO)
Experience providing statistical leadership at a study level
Demonstrated statistical contribution in facilitating and optimizing clinical development
#LI-HYBRID
Our Benefits
CSL employees who work at least 30 hours per week are eligible for benefits effective day 1. We are committed to the wellbeing of our employees and their loved ones. CSL offers resources and benefits, from health care to financial protection, so you can focus on doing work that matters. Our benefits are designed to support the needs of our employees at every stage of their life. Whether you are considering starting a family, need help paying for emergency backup care or summer camp, looking for mental health resources, planning for your financial future, or supporting your favorite charity with a matching contribution, CSL has many benefits to help achieve your goals.
Please take the time to review our benefits site to see what's available to you as a CSL employee.
About CSL Behring
CSL Behring is a global biotherapeutics leader driven by our promise to save lives. Focused on serving patients' needs by using the latest technologies, we discover, develop and deliver innovative therapies for people living with conditions in the immunology, hematology, cardiovascular and metabolic, respiratory, and transplant therapeutic areas. We use three strategic scientific platforms of plasma fractionation, recombinant protein technology, and cell and gene therapy to support continued innovation and continually refine ways in which products can address unmet medical needs and help patients lead full lives.
CSL Behring operates one of the world's largest plasma collection networks, CSL Plasma. Our parent company, CSL, headquartered in Melbourne, Australia, employs 32,000 people, and delivers its lifesaving therapies to people in more than 100 countries.
We want CSL to reflect the world around us
At CSL, Inclusion and Belonging is at the core of our mission and who we are. It fuels our innovation day in and day out. By celebrating our differences and creating a culture of curiosity and empathy, we are able to better understand and connect with our patients and donors, foster strong relationships with our stakeholders, and sustain a diverse workforce that will move our company and industry into the future. Learn more: Inclusion and Belonging | CSL.
Do work that matters at CSL Behring!
Lead Data Scientist
Data scientist job in Columbus, OH
Candidates MUST go on-site at one of the following locations:
Columbus, OH
Cincinnati, OH
Cleveland, OH
Indianapolis, IN
Hagerstown, MD
Chicago, IL
Detroit, MI
Minnetonka, MN
Houston, TX
Charlotte, NC
Akron, OH
Experience:
Master's degree and 5+ years of related work experience using statistics and machine learning to solve complex business problems; experience conducting statistical analysis with advanced statistical software, scripting languages, and packages; experience with big data analysis tools and techniques; and experience building and deploying predictive models, web scraping, and scalable data pipelines
Expert understanding of statistical methods and skills such as Bayesian network inference, linear and non-linear regression, and hierarchical/mixed (multi-level) models (see the illustrative sketch after this list)
Python, R, or SAS; SQL; and some form of lending experience (e.g., HELOC, mortgage) are most important
Excellent communication skills
A candidate with credit card experience (e.g., Discover or Bread Financial) is an A+ fit!
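As a hedged illustration of the hierarchical/mixed-model skills listed above, a random-intercept regression in statsmodels might look like the following; the data are synthetic and every name is invented.

```python
# Hypothetical sketch of a mixed-effects (multi-level) regression.
# The data are synthetic; nothing here comes from the employer.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_groups, n_per_group = 20, 50

df = pd.DataFrame({
    "group": np.repeat(np.arange(n_groups), n_per_group),
    "x": rng.normal(size=n_groups * n_per_group),
})
group_effects = rng.normal(0, 0.5, n_groups)   # random intercepts per group
df["y"] = (1.0 + 0.8 * df["x"]
           + group_effects[df["group"]]
           + rng.normal(0, 1.0, len(df)))

# Random-intercept model: fixed slope for x, per-group intercepts
result = smf.mixedlm("y ~ x", df, groups=df["group"]).fit()
print(result.summary())
```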
Education:
Master's degree or PhD in computer science, statistics, economics or related fields
Responsibilities:
Prioritizes analytical projects based on business value and technological readiness
Performs large-scale experimentation and builds data-driven models to answer business questions
Conducts research on cutting-edge techniques and tools in machine learning/deep learning/artificial intelligence
Evangelizes best practices to analytics and product teams
Acts as the go-to resource for machine learning across a range of business needs
Owns the entire model development process, from identifying business requirements and sourcing data through model fitting, presenting results, and production scoring
Provides leadership, coaching, and mentoring to team members and develops the team to work with all areas of the organization
Works with stakeholders to ensure that business needs are clearly understood and that services meet those needs
Anticipates and analyzes trends in technology while assessing the emerging technology's impact(s)
Coaches individuals through change and serves as a role model
Skills:
Up-to-date knowledge of machine learning and data analytics tools and techniques
Strong knowledge in predictive modeling methodology
Experienced at leveraging both structured and unstructured data sources
Willingness and ability to learn new technologies on the job
Demonstrated ability to communicate complex results to technical and non-technical audiences
Strategic, intellectually curious thinker with focus on outcomes
Professional image with the ability to form relationships across functions
Ability to train more junior analysts regarding day-to-day activities, as necessary
Proven ability to lead cross-functional teams
Strong experience with Cloud Machine Learning technologies (e.g., AWS Sagemaker)
Strong experience with machine learning environments (e.g., TensorFlow, scikit-learn, caret)
Demonstrated Expertise with at least one Data Science environment (R/RStudio, Python, SAS) and at least one database architecture (SQL, NoSQL)
Financial Services background preferred
Data Scientist
Data scientist job in Lewistown, PA
Founded over 35 years ago, First Quality is a family-owned company that has grown from a small business in McElhattan, Pennsylvania into a group of companies, employing over 5,000 team members, while maintaining our family values and entrepreneurial spirit. With corporate offices in New York and Pennsylvania and 8 manufacturing campuses across the U.S. and Canada, the companies within the First Quality group produce high-quality personal care and household products for large retailers and healthcare organizations. Our personal care and household product portfolio includes baby diapers, wipes, feminine pads, paper towels, bath tissue, adult incontinence products, laundry detergents, fabric finishers, and dishwash solutions. In addition, we manufacture certain raw materials and components used in the manufacturing of these products, including flexible print and packaging solutions.
Guided by our values of humility, unity, and integrity, we leverage advanced technology and innovation to drive growth and create new opportunities. At First Quality, you'll find a collaborative environment focused on continuous learning, professional development, and our mission to Make Things Better.
We are seeking a Data Scientist for our First Quality facilities located in McElhattan, PA; Lewistown, PA; and Macon, GA.
**Must have manufacturing experience with consumer goods.**
The role will provide meaningful insight on how to improve our current business operations. This position will work closely with domain experts and SMEs to understand the business problem or opportunity and assess the potential of machine learning to enable accelerated performance improvements.
Principal Accountabilities/Responsibilities
Design, build, tune, and deploy divisional AI/ML tools that meet the agreed upon functional and non-functional requirements within the framework established by the Enterprise IT and IS departments.
Perform large scale experimentation to identify hidden relationships between different data sets and engineer new features
Communicate model performance, results, and tradeoffs to stakeholders
Determine requirements that will be used to train and evolve deep learning models and algorithms
Visualize information and develop engaging dashboards on the results of data analysis.
Build reports and advanced dashboards to tell stories with the data.
Lead, develop, and deliver divisional strategies that demonstrate the what, why, and how of delivering AI/ML business outcomes
Build and deploy a divisional AI strategy and roadmaps that enable long-term success for the organization and align with the Enterprise AI strategy.
Proactively mine data to identify trends and patterns and generate insights for business units and management.
Mentor other stakeholders to grow in their expertise, particularly in AI/ML, and take an active leadership role in divisional executive forums
Work collaboratively with the business to maximize the probability of success of AI projects and initiatives.
Identify technical areas for improvement and present detailed business cases for improvements or new areas of opportunities.
Qualifications/Education/Experience Requirements
PhD or master's degree in Statistics, Mathematics, Computer Science or other relevant discipline.
5+ years of experience using large scale data to solve problems and answer questions.
Prior experience in the Manufacturing Industry.
Skills/Competencies Requirements
Experience in building and deploying predictive models and scalable data pipelines
Demonstrable experience with common data science toolkits, such as Python, PySpark, R, Weka, NumPy, Pandas, scikit-learn, SpaCy/Gensim/NLTK etc.
Knowledge of data warehousing concepts like ETL, dimensional modeling, and semantic/reporting layer design.
Knowledge of emerging technologies such as columnar and NoSQL databases, predictive analytics, and unstructured data.
Fluency in data science, analytics tools, and a selection of machine learning methods - Clustering, Regression, Decision Trees, Time Series Analysis, Natural Language Processing.
Strong problem solving and decision-making skills
Ability to explain deep technical information to non-technical parties
Demonstrated growth mindset, enthusiastic about learning new technologies quickly and applying the gained knowledge to address business problems.
Strong understanding of data governance/management concepts and practices.
Strong background in systems development, including an understanding of project management methodologies and the development lifecycle.
Proven history of managing stakeholder relationships.
Business case development.
What We Offer You
We believe that by continuously improving the quality of our benefits, we can help to raise the quality of life for our team members and their families. At First Quality you will receive:
Competitive base salary and bonus opportunities
Paid time off (three-week minimum)
Medical, dental and vision starting day one
401(k) with employer match
Paid parental leave
Child and family care assistance (dependent care FSA with employer match up to $2500)
Bundle of joy benefit (year's worth of free diapers to all team members with a new baby)
Tuition assistance
Wellness program with savings of up to $4,000 per year on insurance premiums
...and more!
First Quality is committed to protecting information under the care of First Quality Enterprises commensurate with leading industry standards and applicable regulations. As such, First Quality provides at least annual training regarding data privacy and security to employees who, as a result of their role specifications, may come into contact with sensitive data.
First Quality is an Equal Opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, sexual orientation, gender identification, or protected Veteran status.
For immediate consideration, please go to the Careers section at ******************** to complete our online application.
Data Architect
Data scientist job in Cincinnati, OH
THIS IS A W2 (NOT C2C OR REFERRAL BASED) CONTRACT OPPORTUNITY
REMOTE MOSTLY WITH 1 DAY/MO ONSITE IN CINCINNATI - LOCAL CANDIDATES TAKE PREFERENCE
RATE: $75-85/HR WITH BENEFITS
We are seeking a highly skilled Data Architect to function in a consulting capacity to analyze, redesign, and optimize a Medical Payments client's environment. The ideal candidate will have deep expertise in SQL, Azure cloud services, and modern data architecture principles.
Responsibilities
Design and maintain scalable, secure, and high-performing data architectures.
Lead migration and modernization projects in heavily used production systems.
Develop and optimize data models, schemas, and integration strategies.
Implement data governance, security, and compliance standards.
Collaborate with business stakeholders to translate requirements into technical solutions.
Ensure data quality, consistency, and accessibility across systems.
Required Qualifications
Bachelor's degree in Computer Science, Information Systems, or related field.
Proven experience as a Data Architect or similar role.
Strong proficiency in SQL (query optimization, stored procedures, indexing).
Hands-on experience with Azure cloud services for data management and analytics.
Knowledge of data modeling, ETL processes, and data warehousing concepts.
Familiarity with security best practices and compliance frameworks.
Preferred Skills
Understanding of Electronic Health Records systems.
Understanding of Big Data technologies and modern data platforms outside the scope of this project.
Senior Data Engineer
Data scientist job in New York, NY
Godel Terminal is a cutting-edge financial platform that puts the world's financial data at your fingertips. From equities and SEC filings to global news delivered in milliseconds, thousands of customers rely on Godel every day to be their guide to the world of finance.
We are looking for a senior engineer in New York City to join our team and help build out live data services as well as historical data for US markets and international exchanges. This position will specifically work on new asset classes and exchanges, but will be expected to contribute to the core architecture as we expand to international markets.
Our team works quickly and efficiently; we are opinionated but flexible when it's time to ship. We know what needs to be done, and how to do it. We are laser focused on not just giving our customers what they want, but exceeding their expectations. We are very proud that when someone opens the app for the first time they ask: “How on earth does this work so fast?” If that sounds like a team you want to be part of, here is what we need from you:
Minimum qualifications:
Able to work out of our Manhattan office minimum 4 days a week
5+ years of experience in a financial or startup environment
5+ years of experience working on live data as well as historical data
3+ years of experience in Java, Python, and SQL
Experience managing multiple production ETL pipelines that reliably store and validate financial data
Experience launching, scaling, and improving backend services in cloud environments
Experience migrating critical data across different databases
Experience owning and improving critical data infrastructure
Experience teaching best practices to junior developers
Preferred qualifications:
5+ years of experience in a fintech startup
5+ years of experience in Java, Kafka, Python, PostgreSQL
5+ years of experience working with WebSocket libraries like RxStomp or Socket.IO
5+ years of experience wrangling cloud providers like AWS, Azure, GCP, or Linode
2+ years of experience shipping and optimizing Rust applications
Demonstrated experience keeping critical systems online
Demonstrated creativity and resourcefulness under pressure
Experience with corporate debt / bonds and commodities data
Salary range begins at $150,000 and increases with experience
Benefits: Health Insurance, Vision, Dental
To try the product, go to *************************
Sr Data Engineer
Data scientist job in Beachwood, OH
Rate: Up to $75/hr
The Opportunity: Emerald Resource Group is exclusively partnering with a Fortune 500-level Manufacturing & Technology Leader to identify a Senior Data Engineer. This organization operates globally and is currently investing heavily in a massive digital transformation to modernize how they utilize R&D and manufacturing data.
This is a rare opportunity to join a stable, high-revenue enterprise environment where you will build the "data plumbing" that supports critical analytics for global operations.
The Role:
Architect & Build: You will design and implement robust, scalable data pipelines using the Microsoft Azure stack, ensuring data flows seamlessly from legacy on-prem sources to the cloud.
Data Strategy: Partner with the Agile Data Project Manager to translate complex business requirements into technical data models.
Performance Tuning: Serve as the Subject Matter Expert (SME) for query optimization and database performance, handling massive datasets generated by global labs and factories.
Responsibilities:
Develop and maintain ETL/ELT processes using Azure Data Factory (ADF) and Databricks.
Write advanced, high-efficiency SQL queries and stored procedures.
Design data lakes and data warehouses that support Power BI reporting and advanced analytics.
Collaborate with Data Scientists to prepare raw data for machine learning models.
Mentor junior engineers and ensure code quality through rigorous peer reviews.
Requirements (Senior/Principal Level):
8+ years of hands-on experience in Data Engineering or Database Development.
Deep expertise in the Azure Data Stack (Azure SQL, Azure Data Factory, Azure Synapse/Data Warehouse, Databricks).
Mastery of SQL (T-SQL) and experience with Python or Scala for data manipulation.
Proven experience migrating on-premise data (from ERPs like SAP) to the Cloud.
Preferred:
Experience in Manufacturing or Process Industries (Chemical/Pharma).
Knowledge of SAP data structures (extracting data from SAP ECC or S/4HANA).
Familiarity with DevOps practices (CI/CD pipelines for data).
Data Engineer
Data scientist job in New York, NY
DL Software produces Godel, a financial information and trading terminal.
Role Description
This is a full-time, on-site role based in New York, NY, for a Data Engineer. The Data Engineer will design, build, and maintain scalable data systems and pipelines. Responsibilities include data modeling, developing and managing ETL workflows, optimizing data storage solutions, and supporting data warehousing initiatives. The role also involves collaborating with cross-functional teams to improve data accessibility and analytics capabilities.
Qualifications
Strong proficiency in Data Engineering and Data Modeling
Mandatory: strong experience in global financial instruments including equities, fixed income, options and exotic asset classes
Strong Python background
Expertise in Extract, Transform, Load (ETL) processes and tools
Experience in designing, managing, and optimizing Data Warehousing solutions
Data Engineer
Data scientist job in Columbus, OH
We're seeking a skilled Data Engineer based in Columbus, OH, to support a high-impact data initiative. The ideal candidate will have hands-on experience with Python, Databricks, SQL, and version control systems, and be comfortable building and maintaining robust, scalable data solutions.
Key Responsibilities
Design, implement, and optimize data pipelines and workflows within Databricks.
Develop and maintain data models and SQL queries for efficient ETL processes.
Partner with cross-functional teams to define data requirements and deliver business-ready solutions.
Use version control systems to manage code and ensure collaborative development practices.
Validate and maintain data quality, accuracy, and integrity through testing and monitoring.
Required Skills
Proficiency in Python for data engineering and automation.
Strong, practical experience with Databricks and distributed data processing.
Advanced SQL skills for data manipulation and analysis.
Experience with Git or similar version control tools.
Strong analytical mindset and attention to detail.
Preferred Qualifications
Experience with cloud platforms (AWS, Azure, or GCP).
Familiarity with enterprise data lake architectures and best practices.
Excellent communication skills and the ability to work independently or in team environments.
Data Engineer
Data scientist job in New York, NY
Hey all, we are looking for a mid-level data engineer. No third parties. As a result of this expansion, we are seeking experienced software data engineers with 5+ years of relevant experience to support the design and development of a strategic data platform for SMBC Capital Markets and Nikko Securities Group.
Qualifications and Skills
• Proven experience as a Data Engineer with experience in Azure cloud.
• Experience implementing solutions using:
• Azure cloud services
• Azure Data Factory
• Azure Data Lake Storage Gen2
• Azure Databases
• Azure Data Fabric
• API Gateway management
• Azure Functions
• Well versed with Azure Databricks
• Strong SQL skills with RDBMS or NoSQL databases
• Experience with developing APIs using FastAPI or similar frameworks in Python
• Familiarity with the DevOps lifecycle (git, Jenkins, etc.), CI/CD processes
• Good understanding of ETL/ELT processes
• Experience in financial services industry, financial instruments, asset classes and market data are a plus.
Data Engineer
Data scientist job in New York, NY
Data Engineer (3-4 Years Experience)
Remote / On-site (based on client needs)
Employment Type: Full-time (Contract or Contract-to-Hire)
Experience Level: Mid-level (3-4 years)
Company: Aaratech Inc
🛑 Eligibility:
Open to U.S. Citizens and Green Card holders only. We do not offer visa sponsorship.
🔍 About Aaratech Inc
Aaratech Inc is a specialized IT consulting and staffing company that places elite engineering talent into high-impact roles at leading U.S. organizations. We focus on modern technologies across cloud, data, and software disciplines. Our client engagements offer long-term stability, competitive compensation, and the opportunity to work on cutting-edge data projects.
🎯 Position Overview
We are seeking a Data Engineer with 3-4 years of experience to join a client-facing role focused on building and maintaining scalable data pipelines, robust data models, and modern data warehousing solutions. You'll work with a variety of tools and frameworks, including Apache Spark, Snowflake, and Python, to deliver clean, reliable, and timely data for advanced analytics and reporting.
🛠️ Key Responsibilities
Design and develop scalable Data Pipelines to support batch and real-time processing
Implement efficient Extract, Transform, Load (ETL) processes using tools like Apache Spark and dbt (see the sketch after this list)
Develop and optimize queries using SQL for data analysis and warehousing
Build and maintain Data Warehousing solutions using platforms like Snowflake or BigQuery
Collaborate with business and technical teams to gather requirements and create accurate Data Models
Write reusable and maintainable code in Python for data ingestion, processing, and automation
Ensure end-to-end Data Processing integrity, scalability, and performance
Follow best practices for data governance, security, and compliance
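As a hedged sketch of the Spark-based ETL work in the list above (bucket paths, column names, and schema are invented; the client's actual stack may differ):

```python
# Hypothetical PySpark batch ETL: extract CSV, transform, load partitioned Parquet.
# All paths and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files with a header row
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: cast types, derive a date column, drop rows with bad amounts
clean = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount").isNotNull())
)

# Load: write Parquet partitioned by date for downstream warehousing
(clean.write.mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-bucket/curated/orders/"))

spark.stop()
```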
✅ Required Skills & Experience
3-4 years of experience in Data Engineering or a similar role
Strong proficiency in SQL and Python
Experience with Extract, Transform, Load (ETL) frameworks and building data pipelines
Solid understanding of Data Warehousing concepts and architecture
Hands-on experience with Snowflake, Apache Spark, or similar big data technologies
Proven experience in Data Modeling and data schema design
Exposure to Data Processing frameworks and performance optimization techniques
Familiarity with cloud platforms like AWS, GCP, or Azure
⭐ Nice to Have
Experience with streaming data pipelines (e.g., Kafka, Kinesis)
Exposure to CI/CD practices in data development
Prior work in consulting or multi-client environments
Understanding of data quality frameworks and monitoring strategies
Market Data Engineer
Data scientist job in New York, NY
🚀 Market Data Engineer - New York | Cutting-Edge Trading Environment
I'm partnered with a leading technology-driven trading team in New York looking to bring on a Market Data Engineer to support global research, trading, and infrastructure groups. This role is central to managing the capture, normalization, and distribution of massive volumes of historical market data from exchanges worldwide.
What You'll Do
Own large-scale, time-sensitive market data capture + normalization pipelines
Improve internal data formats and downstream datasets used by research and quantitative teams
Partner closely with infrastructure to ensure reliability of packet-capture systems
Build robust validation, QA, and monitoring frameworks for new market data sources
Provide production support, troubleshoot issues, and drive quick, effective resolutions
What You Bring
Experience building or maintaining large-scale ETL pipelines
Strong proficiency in Python + Bash, with familiarity in C++
Solid understanding of networking fundamentals
Experience with workflow/orchestration tools (Airflow, Luigi, Dagster); see the sketch after this list
Exposure to distributed computing frameworks (Slurm, Celery, HTCondor, etc.)
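As a hedged sketch of the orchestration experience mentioned in the list above (DAG name, schedule, and task bodies are invented; real pipelines would replace the stubs):

```python
# Hypothetical Airflow DAG: daily market data capture -> normalize -> validate.
# Task bodies are stubs; every name here is a placeholder.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def capture(**context):
    print("capture raw market data for", context["ds"])

def normalize(**context):
    print("normalize captured files into the internal format")

def validate(**context):
    print("run QA checks on the normalized dataset")

with DAG(
    dag_id="market_data_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",   # `schedule=` in newer Airflow releases
    catchup=False,
) as dag:
    t_capture = PythonOperator(task_id="capture", python_callable=capture)
    t_normalize = PythonOperator(task_id="normalize", python_callable=normalize)
    t_validate = PythonOperator(task_id="validate", python_callable=validate)

    t_capture >> t_normalize >> t_validate
```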
Bonus Skills
Experience working with binary market data protocols (ITCH, MDP3, etc.)
Understanding of high-performance filesystems and columnar storage formats
Data Engineer - VC Backed Healthcare Firm - NYC or San Francisco
Data scientist job in New York, NY
Are you a data engineer who loves building systems that power real impact in the world?
A fast-growing healthcare technology organization is expanding its innovation team and is looking for a Data Engineer II to help build the next generation of its data platform. This team sits at the center of a major transformation effort, partnering closely with engineering, analytics, and product to design the foundation that supports advanced automation, AI, intelligent workflows, and high-scale data operations that drive measurable outcomes for hospitals, health systems, and medical groups.
In this role, you will design, develop, and maintain software applications that process large volumes of data every day. You will collaborate with cross functional teams to understand data requirements, build and optimize data models, and create systems that ensure accuracy, reliability, and performance. You will write code that extracts, transforms, and loads data from a variety of sources into modern data warehouses and data lakes, while implementing best in class data quality and governance practices. You will work hands on with big data technologies such as Hadoop, Spark, and Kafka, and you will play a critical role in troubleshooting, performance tuning, and ensuring the scalability of complex data applications.
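As a hedged illustration of the Kafka-based ingestion this paragraph describes (topic, brokers, group id, and field names are invented; the team's actual stack may differ), a minimal consumer using the kafka-python library might be:

```python
# Hypothetical sketch: consume JSON events from Kafka with kafka-python.
# Topic, brokers, group id, and field names are placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "patient_events",                                  # placeholder topic
    bootstrap_servers=["localhost:9092"],              # placeholder broker
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
    group_id="analytics-loader",
)

for message in consumer:
    event = message.value
    # A real pipeline would validate the event and land it in a warehouse/lake.
    print(event.get("event_type"), event.get("timestamp"))
```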
To thrive here, you should bring strong problem solving ability, analytical thinking, and excellent communication skills. This is an opportunity to join an expanding innovation group within a leading healthcare platform that is investing heavily in data, AI, and the future of intelligent revenue operations. If you want to build systems that make a real difference and work with teams that care deeply about improving patient experiences and provider performance, this is a chance to do highly meaningful engineering at scale.
Data Engineer
Data scientist job in Dublin, OH
The Data Engineer is a technical leader and hands-on developer responsible for designing, building, and optimizing data pipelines and infrastructure to support analytics and reporting. This role will serve as the lead developer on strategic data initiatives, ensuring scalable, high-performance solutions are delivered effectively and efficiently.
The ideal candidate is self-directed, thrives in a fast-paced project environment, and is comfortable making technical decisions and architectural recommendations. The ideal candidate has prior experience with modern data platforms, most notably Databricks and the “lakehouse” architecture. They will work closely with cross-functional teams, including business stakeholders, data analysts, and engineering teams, to develop data solutions that align with enterprise strategies and business goals.
Experience in the financial industry is a plus, particularly in designing secure and compliant data solutions.
Responsibilities:
Design, build, and maintain scalable ETL/ELT pipelines for structured and unstructured data.
Optimize data storage, retrieval, and processing for performance, security, and cost-efficiency.
Ensure data integrity and governance by implementing robust validation, monitoring, and compliance processes.
Consume and analyze data from the data pipeline to infer, predict, and recommend actionable insights that inform operational and strategic decision-making and produce better results.
Empower departments and internal consumers with metrics and business intelligence to operate and direct our business, better serving our end customers.
Determine technical and behavioral requirements, identify strategies as solutions, and select solutions based on resource constraints.
Work with the business, process owners, and IT team members to design solutions for data and advanced analytics solutions.
Perform data modeling and prepare data in databases for analysis and reporting through various analytics tools.
Play a technical specialist role in championing data as a corporate asset.
Provide technical expertise in collaborating with project and other IT teams, internal and external to the company.
Contribute to and maintain system data standards.
Research and recommend innovative and, where possible, automated approaches for system data administration tasks. Identify approaches that leverage our resources and provide economies of scale.
Engineer systems that balance and meet performance, scalability, recoverability (including backup design), maintainability, security, and high-availability requirements and objectives.
Skills:
Databricks and related technologies - SQL, Python, PySpark, Delta Live Tables, data pipelines, AWS S3 object storage, Parquet/columnar file formats, AWS Glue.
Systems Analysis - The application of systems analysis techniques and procedures, including consulting with users, to determine hardware, software, platform, or system functional specifications.
Time Management - Managing one's own time and the time of others.
Active Listening - Giving full attention to what other people are saying, taking time to understand the points being made, asking questions as appropriate, and not interrupting at inappropriate times.
Critical Thinking - Using logic and reasoning to identify the strengths and weaknesses of alternative solutions, conclusions or approaches to problems.
Active Learning - Understanding the implications of new information for both current and future problem-solving and decision-making.
Writing - Communicating effectively in writing as appropriate for the needs of the audience.
Speaking - Talking to others to convey information effectively.
Instructing - Teaching others how to do something.
Service Orientation - Actively looking for ways to help people.
Complex Problem Solving - Identifying complex problems and reviewing related information to develop and evaluate options and implement solutions.
Troubleshooting - Determining causes of operating errors and deciding what to do about it.
Judgment and Decision Making - Considering the relative costs and benefits of potential actions to choose the most appropriate one.
Experience and Education:
High School Diploma (or GED or High School Equivalence Certificate).
Associate degree or equivalent training and certification.
5+ years of experience in data engineering including SQL, data warehousing, cloud-based data platforms.
Databricks experience.
2+ years Project Lead or Supervisory experience preferred.
Must be legally authorized to work in the United States. We are unable to sponsor or take over sponsorship at this time.
Data Engineer (Web Scraping technologies)
Data scientist job in New York, NY
Title: Data Engineer (Web Scraping technologies)
Duration: FTE/Perm
Salary: 125-190k plus bonus
Responsibilities:
Utilize AI Models, Code, Libraries or applications to enable a scalable Web Scraping capability
Manage web scraping requests, including intake, assessment, accessing sites to scrape, utilizing tools to scrape, storage of scrapes, validation, and entitlement to users
Field questions from users about the scrapes and websites
Coordinate with Compliance on approvals and TOU reviews
Build data pipelines on the AWS platform utilizing existing tools like cron, Glue, EventBridge, Python-based ETL, and AWS Redshift
Normalize/standardize vendor data and firm data for firm consumption
Implement data quality checks to ensure reliability and accuracy of scraped data
Coordinate with Internal teams on delivery, access, requests, support
Promote Data Engineering best practices
Required Skills and Qualifications:
Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field
2-5 years of experience in a similar role
Prior buy side experience is strongly preferred (Multi-Strat/Hedge Funds)
Capital markets experience is necessary with good working knowledge of reference data across asset classes and experience with trading systems
AWS cloud experience with commons services (S3, lambda, cron, Event Bridge etc.)
Experience with web-scraping frameworks (Scrapy, BeautifulSoup, Selenium, Playwright, etc.); see the sketch after this list
Strong hands-on skills with NoSQL and SQL databases, programming in Python, data pipeline orchestration tools and analytics tools
Familiarity with time series data and common market data sources (Bloomberg, Refinitiv etc.)
Familiarity with modern DevOps practices and infrastructure-as-code tools (e.g., Terraform, CloudFormation)
Strong communication skills to work with stakeholders across technology, investment, and operations teams.
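As a hedged sketch of the scraping-plus-validation workflow referenced in the list above (the URL and selectors are placeholders, and any real scrape would first clear the compliance/TOU review the posting describes):

```python
# Hypothetical sketch: fetch a page, parse a table, run basic quality checks.
# The URL, selector, and column layout are placeholders only.
import requests
from bs4 import BeautifulSoup

URL = "https://example.com/prices"   # placeholder target

resp = requests.get(URL, timeout=30, headers={"User-Agent": "research-bot/0.1"})
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
rows = []
for tr in soup.select("table#prices tbody tr"):        # placeholder selector
    cells = [td.get_text(strip=True) for td in tr.select("td")]
    if len(cells) == 2:
        rows.append({"symbol": cells[0], "price": float(cells[1])})

# Data quality checks before storing the scrape or entitling it to users
assert rows, "no rows scraped - page layout may have changed"
assert all(r["price"] > 0 for r in rows), "non-positive price found"
print(f"scraped {len(rows)} rows")
```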
Time-Series Data Engineer
Data scientist job in Doylestown, PA
Local Candidates Only - No Sponsorship
A growing technology company in the Warrington, PA area is seeking a Data Engineer to join its analytics and machine learning team. This is a hands-on, engineering-focused role working with real operational time-series data, not a dashboard- or BI-heavy position. We're looking for someone who's naturally curious, self-driven, and enjoys taking ownership. If you like solving real-world problems, building clean and reliable data systems, and contributing ideas that actually get implemented, you'll enjoy this environment.
About the Role
You will work directly with internal engineering teams to build and support production data pipelines, deploy Python-based analytics and ML components, and work with high-volume time-series data from complex systems. This is a hybrid position requiring regular on-site collaboration.
What You'll Do
● Build and maintain data pipelines for time-series and operational datasets
● Deploy Python and SQL-based data processing components using cloud resources
● Troubleshoot issues, optimize performance, and support new customer implementations
● Document deployment workflows and data behaviors
● Work with engineering/domain specialists to identify opportunities for improvement
● Proactively correct inefficiencies; if something can work better, you take the initiative
Required Qualifications
● 2+ years of professional experience in data engineering, data science, ML engineering, or a related field
● Strong Python and SQL skills
● Experience with time-series data or operational/industrial datasets (preferred); see the sketch after this list
● Exposure to cloud environments; Azure experience is a plus but not required
● Ability to think independently, problem-solve, and build solutions with minimal oversight
● Strong communication skills and attention to detail
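As a hedged illustration of the time-series work described above (the sensor signal, frequencies, and gap sizes are synthetic; real operational data would differ), cleaning and resampling with pandas might look like:

```python
# Hypothetical sketch: resample irregular operational time-series data onto a
# regular 1-minute grid and fill only short gaps. The data are synthetic.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic sensor readings arriving roughly every 37 seconds
ts = pd.date_range("2024-01-01", periods=500, freq="37s")
series = pd.Series(20 + rng.normal(0, 0.5, len(ts)), index=ts, name="temp_c")
series = series.sample(frac=0.9, random_state=1).sort_index()  # drop 10% as gaps

# Resample to a regular 1-minute grid, then forward-fill short gaps only
regular = series.resample("1min").mean()
filled = regular.ffill(limit=2)   # tolerate gaps of up to 2 minutes

print(filled.head())
print(f"remaining missing points: {filled.isna().sum()}")
```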
Local + Work Authorization Requirements (Strict)
● Must currently live within daily commuting distance of Warrington, PA (Philadelphia suburbs / Montgomery County / Bucks County / surrounding PA/NJ areas)
● No relocation, no remote-only applicants
● No sponsorship-must be authorized to work in the U.S. now and in the future
These requirements are firm and help ensure strong team collaboration.
What's Offered
● Competitive salary + bonus potential
● Health insurance and paid time off
● Hybrid work flexibility
● Opportunity to grow, innovate, and have a direct impact on meaningful technical work
● Supportive, engineering-first culture
If This Sounds Like You
We'd love to hear from local candidates who are excited about Python, data engineering, and solving real-world problems with time-series data.
Work Authorization:
Applicants must have valid, independent authorization to work in the United States. This position does not offer, support, or accept any form of sponsorship-whether employer, third-party, future, contingent, transfer, or otherwise. Candidates must be able to work for any employer in the U.S. without current or future sponsorship of any kind. Work authorization will be verified, and misrepresentation will result in immediate removal from consideration.
Data Engineer
Data scientist job in Cincinnati, OH
Title: Azure Data Engineer (Only W2)
Duration: 1 Year Contract (potential for conversion/extension)
We need a strong Azure Data Engineer with expertise in Databricks (including Unity Catalog experience), strong PySpark, and an understanding of CI/CD and Infrastructure as Code (IaC) using Terraform.
Requirements
• 7+ years of experience as a Data Engineer
• Hands-on experience with Azure Databricks, Spark, and Python
• Experience with Delta Live Tables (DLT) and Databricks SQL
• Strong SQL and database background
• Experience with Azure Functions, messaging services, or orchestration tools
• Familiarity with data governance, lineage, or cataloging tools (e.g., Purview, Unity Catalog)
• Experience monitoring and optimizing Databricks clusters or workflows
• Experience working with Azure cloud data services and understanding how they integrate with Databricks and enterprise data platforms
• Experience with Terraform for cloud infrastructure provisioning
• Experience with GitHub and GitHub Actions for version control and CI/CD automation
• Strong understanding of distributed computing concepts (partitions, joins, shuffles, cluster behavior)
• Familiarity with SDLC and modern engineering practices
• Ability to balance multiple priorities, work independently, and stay organized
Hadoop Data Engineer
Data scientist job in Pittsburgh, PA
About the job:
We are seeking an accomplished Tech Lead - Data Engineer to architect and drive the development of large-scale, high-performance data platforms supporting critical customer and transaction-based systems. The ideal candidate will have a strong background in data pipeline design, Hadoop ecosystem, and real-time data processing, with proven experience building data solutions that power digital products and decisioning platforms in a complex, regulated environment.
As a technical leader, you will guide a team of engineers to deliver scalable, secure, and reliable data solutions enabling advanced analytics, operational efficiency, and intelligent customer experiences.
Key Roles & Responsibilities
Lead and oversee the end-to-end design, implementation, and optimization of data pipelines supporting key customer onboarding, transaction, and decisioning workflows.
Architect and implement data ingestion, transformation, and storage frameworks leveraging Hadoop, Avro, and distributed data processing technologies.
Partner with product, analytics, and technology teams to translate business requirements into scalable data engineering solutions that enhance real-time data accessibility and reliability.
Provide technical leadership and mentorship to a team of data engineers, ensuring adherence to coding, performance, and data quality standards.
Design and implement robust data frameworks to support next-generation customer and business product launches.
Develop best practices for data governance, security, and compliance aligned with enterprise and regulatory requirements.
Drive optimization of existing data pipelines and workflows for improved efficiency, scalability, and maintainability.
Collaborate closely with analytics and risk modeling teams to ensure data readiness for predictive insights and strategic decision-making.
Evaluate and integrate emerging data technologies to future-proof the data platform and enhance performance.
Must-Have Skills
8-10 years of experience in data engineering, with at least 2-3 years in a technical leadership role.
Strong expertise in the Hadoop ecosystem (HDFS, Hive, MapReduce, HBase, Pig, etc.).
Experience working with Avro, Parquet, or other serialization formats.
Proven ability to design and maintain ETL / ELT pipelines using tools such as Spark, Flink, Airflow, or NiFi.
Proficiency in Python or Scala for large-scale data processing.
Strong understanding of data modeling, data warehousing, and data lake architectures.
Hands-on experience with SQL and both relational and NoSQL data stores.
Cloud data platform experience with AWS.
Deep understanding of data security, compliance, and governance frameworks.
Excellent problem-solving, communication, and leadership skills.
Data Engineer
Data scientist job in Philadelphia, PA
Data Engineer - Job Opportunity
Full time Permanent
Remote - East coast only
Please note this role is open for US citizens or Green Card Holders only
We're looking for a Data Engineer to help build and enhance scalable data systems that power analytics, reporting, and business decision-making. This role is ideal for someone who enjoys solving complex technical challenges, optimizing data workflows, and collaborating across teams to deliver reliable, high-quality data solutions.
What You'll Do
Develop and maintain scalable data infrastructure, cloud-native workflows, and ETL/ELT pipelines supporting analytics and operational workloads.
Transform, model, and organize data from multiple sources to enable accurate reporting and data-driven insights.
Improve data quality and system performance by identifying issues, optimizing architecture, and enhancing reliability and scalability.
Monitor pipelines, troubleshoot discrepancies, and resolve data or platform issues, including participating in on-call support when needed.
Prototype analytical tools, automation solutions, and algorithms to support complex analysis and drive operational efficiency.
Collaborate closely with BI, Finance, and cross-functional teams to deliver robust and scalable data products.
Create and maintain clear, detailed documentation (configurations, specifications, test scripts, and project tracking).
Contribute to Agile development processes, engineering excellence, and continuous improvement initiatives.
What You Bring
Bachelor's degree in Computer Science or a related technical field.
2-4 years of hands-on SQL experience (Oracle, PostgreSQL, etc.).
2-4 years of experience with Java or Groovy.
2+ years working with orchestration and ingestion tools (e.g., Airflow, Airbyte).
2+ years integrating with APIs (SOAP, REST).
Experience with cloud data warehouses and modern ELT/ETL frameworks (e.g., Snowflake, Redshift, DBT) is a plus.
Comfortable working in an Agile environment.
Practical knowledge of version control and CI/CD workflows.
Experience with automation, including unit and integration testing.
Understanding of cloud storage solutions (e.g., S3, Blob Storage, Object Store).
Proactive mindset with strong analytical, logical-thinking, and consultative skills.
Ability to reason about design decisions and understand their broader technical impact.
Strong collaboration, adaptability, and prioritization abilities.
Excellent problem-solving and troubleshooting skills.
Senior Data Engineer
Data scientist job in Cincinnati, OH
Data Engineer III
About the Role
We're looking for a Data Engineer III to play a key role in a large-scale data migration initiative within a client's commercial lending, underwriting, and reporting areas. This is a hands-on engineering role that blends technical depth with business analysis, focused on transforming legacy data systems into modern, scalable pipelines.
What You'll Do
Analyze legacy SQL, DataStage, and SAS code to extract business logic and identify key data dependencies.
Document current data usage and evaluate the downstream impact of migrations.
Design, build, and maintain data pipelines and management systems to support modernization goals.
Collaborate with business and technology teams to translate requirements into technical solutions.
Improve data quality, reliability, and performance across multiple environments.
Develop backend solutions using Python, Java, or J2EE, and integrate with tools like DataStage and dbt.
What You Bring
5+ years of experience with relational and non-relational databases (SQL, Snowflake, DB2, MongoDB).
Strong background in legacy system analysis (SQL, DataStage, SAS).
Experience with Python or Java for backend development.
Proven ability to build and maintain ETL pipelines and automate data processes.
Exposure to AWS, Azure, or GCP.
Excellent communication and stakeholder engagement skills.
Financial domain experience, especially commercial lending or regulatory reporting, is a big plus.
Familiarity with Agile methodologies preferred.
Data Engineer
Data scientist job in New York, NY
Haptiq is a leader in AI-powered enterprise operations, delivering digital solutions and consulting services that drive value and transform businesses. We specialize in using advanced technology to streamline operations, improve efficiency, and unlock new revenue opportunities, particularly within the private capital markets.
Our integrated ecosystem includes PaaS - Platform as a Service, the Core Platform, an AI-native enterprise operations foundation built to optimize workflows, surface insights, and accelerate value creation across portfolios; SaaS - Software as a Service, a cloud platform delivering unmatched performance, intelligence, and execution at scale; and S&C - Solutions and Consulting Suite, modular technology playbooks designed to manage, grow, and optimize company performance. With over a decade of experience supporting high-growth companies and private equity-backed platforms, Haptiq brings deep domain expertise and a proven ability to turn technology into a strategic advantage.
The Opportunity
As a Data Engineer within the Global Operations team, you will be responsible for managing the internal data infrastructure, building and maintaining data pipelines, and ensuring the integrity, cleanliness, and usability of data across our critical business systems. This role will play a foundational part in developing a scalable internal data capability to drive decision-making across Haptiq's operations.
Responsibilities and Duties
Design, build, and maintain scalable ETL/ELT pipelines to consolidate data from delivery, finance, and HR systems (e.g., Kantata, Salesforce, JIRA, HRIS platforms).
Ensure consistent data hygiene, normalization, and enrichment across source systems.
Develop and maintain data models and data warehouses optimized for analytics and operational reporting.
Partner with business stakeholders to understand reporting needs and ensure the data structure supports actionable insights.
Own the documentation of data schemas, definitions, lineage, and data quality controls.
Collaborate with the Analytics, Finance, and Ops teams to build centralized reporting datasets.
Monitor pipeline performance and proactively resolve data discrepancies or failures.
Contribute to architectural decisions related to internal data infrastructure and tools.
Requirements
3-5 years of experience as a data engineer, analytics engineer, or similar role.
Strong experience with SQL, data modeling, and pipeline orchestration (e.g., Airflow, dbt).
Hands-on experience with cloud data warehouses (e.g., Snowflake, BigQuery, Redshift).
Experience working with REST APIs and integrating with SaaS platforms like Salesforce, JIRA, or Workday.
Proficiency in Python or another scripting language for data manipulation.
Familiarity with modern data stack tools (e.g., Fivetran, Stitch, Segment).
Strong understanding of data governance, documentation, and schema management.
Excellent communication skills and ability to work cross-functionally.
Benefits
Flexible work arrangements (including hybrid mode)
Great Paid Time Off (PTO) policy
Comprehensive benefits package (Medical / Dental / Vision / Disability / Life)
Healthcare and Dependent Care Flexible Spending Accounts (FSAs)
401(k) retirement plan
Access to HSA-compatible plans
Pre-tax commuter benefits
Employee Assistance Program (EAP)
Opportunities for professional growth and development.
A supportive, dynamic, and inclusive work environment.
Why Join Us?
We value creative problem solvers who learn fast, work well in an open and diverse environment, and enjoy pushing the bar for success ever higher. We do work hard, but we also choose to have fun while doing it.
The compensation range for this role is $75,000 to $80,000 USD