Reliability Engineer jobs at NBCUniversal - 75 jobs
Sr Site Reliability Engineer
NBCUniversal 4.8
Reliability engineer job at NBCUniversal
NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad-supported streaming service. We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry-leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood. NBCUniversal is a subsidiary of Comcast Corporation. Visit ******************** for more information.
Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.
Job Description
The Unified Communication Engineer at NBCUniversal holds extensive responsibility across various Unified Communications platforms. This position requires practical experience with configuring, integrating, and troubleshooting systems based on Cisco, Microsoft, AWS, and Zoom technologies. Key duties include designing, engineering, implementing, and providing technical support for all UC projects, such as installing and maintaining equipment, monitoring systems, responding to alerts, and resolving issues promptly. The engineer also oversees and improves complex telecommunications systems that support both contact center operations and enterprise infrastructure.
Responsibilities:
Manage and improve on-premises telecom systems including Cisco Unified Communications Manager, Unity Connection, Emergency Responder, Customer Voice Portal, Telepresence Video Communication Server, CUBE, and Ribbon SBC platforms.
Support and optimize cloud telecom systems like Cisco Webex, MS Teams, and Zoom.
Plan, design, and implement IVR scripts across Cisco Unity Connection, MS Teams, and Amazon Connect.
Deploy, operate, support, and maintain Contact Center environments (Amazon Connect, MS Teams, Zoom).
Integrate and test new UC technologies with enterprise applications.
Install, maintain, and monitor VoIP equipment, troubleshoot issues, and ensure stability of the voice network.
Handle service tickets, provide root-cause analysis, prepare outage reports, and resolve issues at all levels.
Collaborate with cross-functional teams to complete tasks efficiently.
Oversee vendor management.
Specific solutions include:
Cisco Unified Communications Manager
Cisco Unity Connection and Emergency Responder
Ribbon SBC Platforms
MS Teams Voice Integrations
AWS Contact Center Operations
Qualifications
Comprehensive understanding of LAN/WAN networking fundamentals in relation to voice systems.
Advanced proficiency in troubleshooting voice networks, including Quality of Service (QoS), dial plans, and call routing.
Extensive experience with Cisco PBX infrastructure, encompassing configuration, troubleshooting, and change management.
In-depth knowledge and hands-on experience with Microsoft Teams Admin Portal and Webex Control Hub.
Expertise in hosted voice solutions and VoIP trunking, with practical exposure to Session Border Controllers such as Cisco CUBE and Ribbon Core/Edge SBC.
Familiarity with virtualization technologies and practical experience working with UCS servers, VMware, and vCenter.
Understanding of Amazon Connect Contact Center programming and operational functionality.
Demonstrated ability to implement and monitor project progress.
Hybrid: This position has been designated as hybrid, which currently requires contributing from the office a minimum of three days per week. Beginning January 5, 2026, hybrid employees will be required to work from the office a minimum of four days per week. The Company reserves the right to change in-office requirements at any time.
This position is eligible for company sponsored benefits, including medical, dental and vision insurance, 401(k), paid leave, tuition reimbursement, and a variety of other discounts and perks. Learn more about the benefits offered by NBCUniversal by visiting the Benefits page of the Careers website.
Salary range: $130,000 - $160,000
Additional Information
As part of our selection process, external candidates may be required to attend an in-person interview with an NBCUniversal employee at one of our locations prior to a hiring decision. NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law.
NBCUniversal will consider for employment qualified applicants with criminal histories, or arrest or conviction records, in a manner consistent with relevant legal requirements, including the City of Los Angeles' Fair Chance Initiative For Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, where applicable.
If you are a qualified individual with a disability or a disabled veteran and require support throughout the application and/or recruitment process as a result of your disability, you have the right to request a reasonable accommodation. You can submit your request to [email protected].
$130k-160k yearly 11d ago
Sr Audio Digital Signal Processing Engineer
The Walt Disney Company (Germany) GmbH 4.6
San Francisco, CA jobs
The Skywalker Sound Development Group is seeking an innovative and experienced Audio Digital Signal Processing Engineer to develop and implement advanced audio processing solutions that integrate seamlessly into modern media production pipelines. This role bridges deep DSP expertise with practical engineering skills, enabling the creation of tools and algorithms that meet the demanding standards of the film and entertainment industries.
As a Senior Audio DSP Engineer, you will design and optimize audio processing algorithms while collaborating with cross-functional teams to deploy solutions using modern development practices. Your work will focus on enabling high-quality, efficient, and reliable audio workflows for applications such as speech enhancement, dynamics processing, noise reduction, and spatial audio processing.
This role is considered Hybrid, which means the employee will work 2-3 days onsite at our San Francisco office and occasionally from home.
What you'll do
Develop, implement, and optimize DSP algorithms for audio applications.
Collaborate with AI/ML researchers, data scientists, and engineers to combine traditional DSP techniques with machine learning approaches.
Ensure compliance with film and media audio standards, including channel formats (e.g., 5.1, 7.1, Atmos), bit depth, sample rates and timecode.
Utilize collaborative development workflows, including GitLab, to manage source control and ensure clean, maintainable codebases.
Leverage CI/CD pipelines for building, testing, and deploying DSP applications.
Integrate DSP solutions into scalable, containerized environments using tools like Docker.
Write and execute unit tests to validate algorithm accuracy and performance.
Work with media-specific file containers (e.g., MXF, WAV, AIFF) and codecs, ensuring compatibility across workflows.
Collaborate with cross-disciplinary teams to ensure solutions meet both technical and creative needs.
Stay informed of emerging trends in audio technology, DSP, and media workflows to guide future development efforts.
What we're looking for
Bachelor's Degree in Electrical Engineering, Computer Science, or a related field; equivalent professional experience considered. Master's degree is preferred.
5+ years of experience in DSP development, with applications in media, film, or entertainment industries.
Strong knowledge of digital signal processing principles, including analog and digital filtering, audio effects, spectral and spatial analysis/synthesis, perceptual audio coding, parametric audio coding and dynamics processing.
Proficiency in programming languages such as C/C++, Python, or MATLAB, with experience using collaborative Git workflows.
Experience with containerized deployments (e.g., Docker).
Experience with optimization of processing for cloud and on-prem application deployment.
Familiarity with modern CI/CD tools (e.g. GitLab CI/CD) for building and deploying software.
Solid understanding of film audio standards and formats, including multichannel and immersive audio (e.g., Dolby Atmos, Ambisonics).
Experience with unit testing frameworks and automated test pipelines.
Preferred Qualifications
Experience working with media production pipelines, including audio post-production workflows.
Familiarity with audio processing libraries and plugin frameworks such as JUCE, CoreAudio, or PortAudio.
Knowledge of machine learning frameworks and how they interact with DSP, such as using neural networks for audio effects.
Expertise with audio file formats and metadata, including WAV, AIFF, and MXF containers.
Contributions to open-source projects or published research in the field of audio DSP.
Understanding of GPU acceleration for audio processing tasks.
Knowledge of machine learning techniques, tools, and libraries such as PyTorch.
The hiring range for this position in San Francisco, CA is $117,100 to $156,900 per year. The base pay actually offered will take into account internal equity and also may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.
About The Walt Disney Company: Disability Accommodation for Employment Applications
The Walt Disney Company and its Affiliated Companies are Equal Employment Opportunity employers and welcome all job seekers including individuals with disabilities and veterans with disabilities. If you have a disability and believe you need a reasonable accommodation in order to search for a job opening or apply for a position, visit the Disney candidate disability accommodations FAQs. We will only respond to those requests that are related to the accessibility of the online application system due to a disability.
$117.1k-156.9k yearly 5d ago
Lead Site Reliability Engineer
One Dynamic 3.7
Remote
Quick Details
Location: Fully Remote (US)
Experience: 8+ Years
Rate: $70-75/hour
Duration: 6 months+
About One Dynamic
One Dynamic is a Service-Disabled Veteran-Owned Small Business (SDVOSB) headquartered in Fairfax, VA. We specialize in digital transformation, cloud infrastructure, quality assurance, and enterprise architecture for federal and healthcare organizations. We are currently seeking a Lead Site Reliability Engineer to support our client ARC, a rapidly growing device management company revolutionizing how frontline workers interact with enterprise mobile devices.
About the Role
The Lead Site Reliability Engineer is a senior technical leadership role responsible for the reliability, availability, and operational excellence of the cloud infrastructure and kiosks platform. This role owns uptime, SLAs, and incident response while driving long-term improvements to system resilience, observability, and operational maturity. The Lead SRE serves as both a hands-on technical leader and a force multiplier across platform, QA, and development teams.
This role is well-suited for an experienced engineer who thrives in high-ownership environments and can balance real-time operational demands with strategic reliability initiatives. Strong communication, sound technical judgment, and a bias toward preventative engineering are critical to success.
Key Responsibilities
Own uptime, SLAs, and overall reliability of the cloud infrastructure and kiosks platform
Lead incident response, root-cause analysis, and drive actionable postmortems
Automate infrastructure, deployments, and operational tasks using modern IaC and scripting in collaboration with the Platform Engineering team
Maintain and improve monitoring, alerting, and observability (e.g., Grafana, Prometheus, New Relic)
Execute and continuously improve disaster recovery and business continuity plans
Partner with platform engineering, QA, and development teams to ensure operational readiness
Establish and maintain runbooks, operational standards, and reliability best practices
Provide leadership, mentorship, and clear communication during both normal operations and incidents
Optimize cloud and Kubernetes environments for reliability, performance, and scalability
Required Qualifications
8+ years in SRE, DevOps, or Platform Engineering roles; 2+ years in a senior or lead capacity
Strong experience supporting production environments with strict SLAs and high uptime requirements
Deep knowledge of Kubernetes, containers, and cloud-native infrastructure
Proficiency in automation and scripting using Bash, Python, or Go
Hands-on experience with CI/CD pipelines and release engineering in modern environments
Expert-level familiarity with IaC tools (Terraform preferred)
Strong understanding of monitoring, alerting, logging, and observability tooling
Experience implementing and managing GitOps workflows (ArgoCD or similar)
Demonstrated ability to lead incidents and communicate effectively with technical and non-technical stakeholders
Solid understanding of disaster recovery planning, resilience practices, and system hardening
Must be authorized to work in the United States (US-based candidates only)
The Ideal Candidate
You think several steps ahead. You are relentless, strategic, and a long-term thinker. You believe the details are essential, and so you get them right. You are a fast learner. You take feedback well and implement it. You care about achieving the best outcome and do not focus on being right or wrong.
About the Client
ARC is a device management solution integrated with smart lockers, designed to store, secure, and charge company-owned handheld devices (e.g., Zebra, Honeywell) used by frontline workers to perform core job functions. Launched in late 2021, ARC was spun off from ChargeItSpot, a consumer-facing phone-charging technology company established in 2012.
ARC's Mission: Minimize Device Waste. Maximize Worker Productivity. Make Life Easier.
How to Apply
If you have the unique combination of skills and qualities we are seeking, please submit your resume via One Dynamic's careers portal. We look forward to hearing from you!
One Dynamic is an Equal Opportunity employer. Personnel are chosen based on ability without regard to race, color, religion, sex, national origin, disability, marital status, or sexual orientation, in accordance with federal and state law.
$70-75 hourly Auto-Apply 32d ago
FedRAMP Site Reliability Engineer (FedSRE) - CloudVision
Arista 4.1
Remote
Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.
At Arista we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.
Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.
Job Description
Who You'll Work With
We're looking for Site Reliability Engineers to join Arista's FedRAMP CloudVision-as-a-Service (CVaaS) SRE team.
SREs at Arista combine a strong software engineering background and systems architecture knowledge with a passion for operating production systems at scale. We are responsible for our global CloudVision service fleet, ensuring scalability, reliability, and stability. You'll gain firsthand experience as part of a rapidly growing product, alongside a passionate group of engineers who unapologetically put product reliability and customer experience first. We deeply believe in building highly automated and self-sustaining environments, prioritizing safe and efficient operations that leverage cutting-edge technologies and tools.
What You'll Do
As an SRE, you'll have the chance to drive, develop, and lead projects in any of the following areas, with a FedRAMP perspective:
Data Platform (NetDL) Architecture and Performance
Capacity Planning
Autoscaling
Disaster Recovery
Observability
Change Management - CI/CD
Service Network Architecture
Cost Optimizations
Infrastructure and Cloud-First Application Security
You will also join the FedRAMP on-call rotation, which is split between West Coast and East Coast time zones.
Arista's CloudVision is an enterprise network management and streaming telemetry SaaS offering. The CloudVision stack is entirely Kubernetes-native. Familiarity with GCP (Google Cloud Platform) and GKE (Google Kubernetes Engine) is preferred. Our technical stack includes, but is not limited to, Golang, Python, Ansible/Pulumi, and Bash. You will be expected to develop, operate, and work with many different types of databases, either directly on Kubernetes or through managed DB products. We integrate with many different Open Source Software (OSS) projects that power our microservices stack, monitoring infrastructure, and much more.
Qualifications
BS or MS in Computer Science or a related technical field, or equivalent practical experience
5+ years of software engineering experience
U.S. Citizenship required
Experience managing FedRAMP SaaS environments or other highly regulated systems
Proven experience developing or managing deployments of distributed database systems or large-scale SaaS applications
Proficiency in Python, Go (Golang), or other programming languages
Strong scripting skills in Bash or other scripting languages
Compensation Information
The new hire base pay for this role has a pay range of $101,000 to $161,000 across the US. Arista offers different pay ranges based on work location, so that we can offer consistent and competitive pay appropriate to the market. The actual base pay offered will be based on a wide range of factors, including skills, qualifications, relevant experience, and work location. The pay range provided reflects base pay only and in addition certain roles may also be eligible for discretionary Arista bonuses and equity. Employees in Sales roles are eligible to participate in Arista's Sales Incentive Plan, which pays commissions calculated as a percentage of eligible sales. US-based employees are also entitled to benefits including medical, dental, vision, wellbeing, tax savings and income protection. The recruiting team can share more details during the hiring process specific to the role and location.
Additional Information
Arista Networks is an equal opportunity employer. Arista makes all hiring and employment-related decisions in a non-discriminatory manner without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other factor determined to be unlawful under applicable federal, state, or local law. All your information will be kept confidential according to EEO guidelines.
$101k-161k yearly 1d ago
Site Reliability Engineer III
Verisk Analytics 4.6
Jersey City, NJ jobs
As a Senior Site Reliability Engineer, you'll bridge the gap between software development and operations, applying software engineering principles to infrastructure and operations problems. You'll help design, build, and maintain the systems that keep our services reliable and scalable while working closely with development teams to improve application performance and resilience.
10+ years of experience in SRE, DevOps, or similar roles
Expertise in incident management, disaster recovery, and building resilience engineering frameworks
Strong programming skills in at least one language, such as Java or Python
Extensive experience with Linux/Unix systems administration
Hands-on experience with serverless (Lambda) and containerization technologies (Docker)
Experience implementing and managing cloud infrastructure (AWS, Azure DevOps)
Advanced understanding of networking concepts, load balancing, security best practices, and CDN technologies
Strong experience with observability systems (like Dynatrace)
Knowledge of database technologies and their performance characteristics
Demonstrated experience leading incident response and post-mortem analysis
Bachelor's degree in computer science or equivalent practical experience
Deep knowledge of infrastructure-as-code tools (Terraform, CloudFormation)
Mastery of CI/CD pipeline design and implementation (Jenkins, GitLab CI, Azure DevOps)
Experience building and maintaining comprehensive monitoring and alerting systems
Experience managing high-traffic, mission-critical production environments
Strong background in capacity planning and performance optimization
Proven ability to mentor junior SREs and elevate team capabilities
Experience driving cross-team initiatives to improve reliability practices
Track record of successfully advocating for and implementing architectural improvements
Strong incident management skills, including crisis communication
Design, implement, and maintain reliable infrastructure systems with a focus on security, scalability, reliability, and automation using tools like Terraform or CloudFormation
Build and maintain scalable and resilient production systems with a focus on automation
Develop and implement monitoring solutions to ensure system health, performance, and availability
Lead incident response, perform root cause analysis, and implement preventative measures
Track SLOs and SLAs to measure service reliability, and use error budgets to drive reliability improvements
Design and implement CI/CD pipelines to enable rapid and reliable software delivery
Partner with development teams to improve application performance, resilience, and scalability
Contribute to capacity planning and performance optimization initiatives
Participate in an on-call rotation to support production systems
Mentor junior engineers and contribute to the growth of the team
Develop and evolve security monitoring, alerting, and incident response frameworks
$89k-125k yearly est. Auto-Apply 60d+ ago
Site Reliability Engineer - Media Production Infrastructure (Open to relocation)
Media.Monks 4.1
Cupertino, CA jobs
Please note that we will never request payment or bank account information at any stage of the recruitment process. As we continue to grow our teams, we urge you to be cautious of fraudulent job postings or recruitment activities that misuse our company name and information. Please protect your personal information during any recruitment process. While Monks may contact potential candidates via LinkedIn, all applications must be submitted through our official website (monks.com/careers).
About the Role
This position is on-site and based in Cupertino, CA. We are open to candidates who are willing to relocate within a reasonable timeframe.
We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our Platform Engineering team, supporting a world-class media production environment for a leading global technology company. This is a crucial role within a Managed Services model, focused on ensuring the high availability, performance, and resilience of critical server, storage, and media workflow systems. You will be one of two dedicated on-site SREs who will partner with remote and consulting staff to provide around-the-clock operational support and continuous infrastructure improvement.
Key Responsibilities
* Infrastructure Management: Maintain and troubleshoot all production hardware, servers, and storage infrastructure, with a specialized focus on the Storage Area Network (SAN).
* Storage Expertise: Execute key maintenance and support for the SAN environment, including firmware/software updates for fiber switches, RAIDs, and tape systems.
* Networking and System Admin: Manage Directory services, network services (DNS, static IPs, subnet masks), and configure shares and permissions on the SAN.
* Monitoring and Observability: Manage and improve custom dashboards for 24/7 monitoring of systems, RAIDs, temperature sensors, and backup/archive processes.
* Custom Application Support: Contribute to the development and maintenance of custom applications and dashboards that support media workflows, including tools for project deployment, directory services integration, and ticketing.
* Remote/On-Demand Support: Provide active on-site support and participate in a 24/7 on-call rotation for critical interventions (e.g., power/cooling issues).
* Backup and Archive: Manage the Backup and Archive environment, maintain tape systems, and prepare projects for archiving to the cloud.
Qualifications & Experience
* Experience: 14+ years of experience working with macOS and SAN environments, preferably Xsan.
* Experience working with StorNext and Jamf
* Technical Depth:
* Deep expertise in Fibre Channel networking.
* Demonstrated experience with hardware RAIDs, block storage, and LUN creation.
* Thorough knowledge of macOS ACLs, POSIX permissions, and Directory Services.
* Expertise in installing and configuring Prometheus and Grafana, including creating Prometheus exporters.
* Software & Scripting:
* Experience with Shell Scripting
* Experience with remote connection technologies
* Thorough knowledge of data management for media and entertainment
Please note that while this role is open to candidates willing to relocate, Monks does not offer relocation assistance for this position. Any relocation costs would be the responsibility of the candidate.
What We Offer
Benefits
* Excellent, full-coverage medical, dental, and vision insurance
* Generous PTO and 15 company-wide holidays
* 401k with company contribution
* Paid parental leave
* Work-life balance with an emphasis on personal well-being
* Career growth in a disruptor space & entrepreneurial opportunities within the Monks network
* A globally diverse & inclusive culture with employee resource groups such as S4 Melanin, Pride.Monks, Cultura.Monks, and more!
* Authentic commitment to DEI efforts and sustainable growth
This role is subject to our Return to Office (RTO) policy. If you reside within a commutable distance of one of our office locations, you will be expected to work from the office a set number of days per week. The specific details, including the number of required office days, will be in accordance with the company's then-current RTO policy, which is subject to change from time to time.
Monks has provided a compensation range that represents its good faith estimate of what Monks may pay for the position at the time of posting. Monks may ultimately pay more or less than the posted compensation range. The salary offered to the selected candidate will be determined based on job-related factors, but not based on a candidate's sex or any other protected status.
Salary Range: $133,298-$150,925 USD
About Monks
Monks is the global, purely digital, unitary operating brand of S4Capital plc. With a legacy of innovation and specialized expertise, Monks combines an extraordinary range of global marketing and technology services to accelerate business possibilities and redefine how brands and businesses interact with the world. Its integration of systems and workflows delivers unfettered content production, scaled experiences, enterprise-grade technology, and data science fueled by AI, all managed by the industry's best and most diverse digital talent, to help the world's trailblazing companies outmaneuver and outpace their competition.
Monks was named a Contender in The Forrester Wave: Global Marketing Services. It has remained a constant presence on Adweek's Fastest Growing lists (2019-23), ranks among Cannes Lions' Top 10 Creative Companies (2022-23) and is the only partner to have been placed in AdExchanger's Programmatic Power Players list every year (2020-24). In addition to being named Adweek's first AI Agency of the Year (2023), Monks has been recognized by Business Intelligence in its 2024 Excellence in Artificial Intelligence Awards program in three categories: the Individual category, Organizational Winner in AI Strategic Planning and AI Product for its service Monks.Flow. Monks has also garnered the title of Webby Production Company of the Year (2021-24), won a record number of FWAs and has earned a spot on Newsweek's Top 100 Global Most Loved Workplaces 2023.
We are an equal-opportunity employer committed to building a respectful and empowering work environment for all people to freely express themselves amongst colleagues who embrace diversity in all respects. Including fresh voices and unique points of view in all aspects of our business not only creates an environment where we can all grow and thrive but also increases our potential to produce work that better represents, and resonates with, the world around us.
$133.3k-150.9k yearly 2d ago
Site reliability engineer
Writer 4.2
New York, NY jobs
WRITER is where the world's leading enterprises orchestrate AI-powered work. Our vision is to expand human capacity through superintelligence. And we're proving it's possible - through powerful, trustworthy AI that unites IT and business teams together to unlock enterprise-wide transformation. With WRITER's end-to-end platform, hundreds of companies like Mars, Marriott, Uber, and Vanguard are building and deploying AI agents that are grounded in their company's data and fueled by WRITER's enterprise-grade LLMs. Valued at $1.9B and backed by industry-leading investors including Premji Invest, Radical Ventures, and ICONIQ Growth, WRITER is rapidly cementing its position as the leader in enterprise generative AI.
Founded in 2020 with office hubs in San Francisco, New York City, Austin, Chicago, and London, our team thinks big and moves fast, and we're looking for smart, hardworking builders and scalers to join us on our journey to create a better future of work with AI.
📐 About the role
At WRITER, our mission to expand human capacity with superintelligence relies on a foundational truth: our platform must be available, performant, and reliable, 24/7. As a site reliability engineer, you'll be at the heart of making this a reality, impacting every enterprise customer who trusts us with their AI-powered workflows. This isn't just about keeping the lights on; it's about pushing the boundaries of what's possible, proactively identifying and solving complex systemic challenges, and laying the groundwork for our rapid growth and the evolving demands of enterprise generative AI. You'll build resilient systems, automate across the stack, and champion reliability best practices, directly enabling our ambitious product roadmap and ensuring our customers always have access to the powerful tools they need.
This is a hybrid position, based out of our New York City or London hubs. You'll report to our director of engineering.
🦸🏻‍♀️ What you'll do
Automate operational tasks and infrastructure management by developing robust tools and platforms using Python, Go, or similar languages, significantly reducing manual toil across our production environment
Design and implement scalable, fault-tolerant infrastructure solutions on public cloud providers (AWS, GCP, Azure) to support WRITER's rapidly expanding, high-traffic AI platform
Own the reliability, performance, and efficiency of WRITER's core services, defining and upholding stringent Service Level Objectives (SLOs) and Error Budgets
Own the observability stack for monitoring, logging, and alerting systems to ensure rapid detection of issues across our complex distributed systems
Lead incident response, post-mortems, and root cause analyses, applying learnings to proactively prevent future outages and build a more resilient system architecture
Collaborate closely with product and engineering teams, providing expert guidance on system design for reliability, performance, and scalability from conception through launch
⭐️ What you need
A solid 7+ years of experience in site reliability engineering, DevOps, or a similar role focused on building and operating large-scale, high-availability production systems
Deep expertise with cloud platforms (AWS strongly preferred), containerization technologies like Docker and Kubernetes, and Infrastructure-as-Code tools such as Terraform
Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring
Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) to maintain system health and performance
Demonstrated ability to Challenge the status quo, proactively identify systemic weaknesses, and propose innovative solutions to complex reliability problems
Excellent communication, collaboration, and problem-solving skills, with a talent for building strong relationships and Connecting with cross-functional teams
A strong sense of ownership and accountability, eager to Own mission-critical systems and drive them toward peak performance and unparalleled reliability
🍩 Benefits & perks (US Full-time employees)
Generous PTO, plus company holidays
Medical, dental, and vision coverage for you and your family
Paid parental leave for all parents (12 weeks)
Fertility and family planning support
Early-detection cancer testing through Galleri
Flexible spending account and dependent FSA options
Health savings account for eligible plans with company contribution
Annual work-life stipends for:
Wellness stipend for gym, massage/chiropractor, personal training, etc.
Learning and development stipend
Company-wide off-sites and team off-sites
Competitive compensation, company stock options and 401k
WRITER is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
By submitting your application on the application page, you acknowledge and agree to WRITER's Global Candidate Privacy Notice.
$103k-135k yearly est. Auto-Apply 11d ago
Site Reliability Engineer II
RELX 4.1
Evanston, IL jobs
About the Business:
LexisNexis Risk Solutions is the essential partner in the assessment of risk. Within our Business Services vertical, we offer a multitude of solutions focused on helping businesses of all sizes drive higher revenue growth, maximize operational efficiencies, and improve customer experience. Our solutions help our customers solve difficult problems in the areas of Anti-Money Laundering/Counter Terrorist Financing, Identity Authentication & Verification, Fraud and Credit Risk mitigation and Customer Data Management. You can learn more about LexisNexis Risk at the link below, ***************************
About our Team:
The Core Engineering Team is the core driver of our enterprise-level standard process creation and delivery. We are a high-impact group focused on building, automating, and maintaining the standards that ensure our products are deployed reliably, securely, and at high velocity.
We are champions of automation, Infrastructure as Code (IaC), and operational excellence. Joining us means working hands-on with modern multi-cloud platforms and cutting-edge tools to enhance system reliability, visibility, and security across the entire development lifecycle.
If you are passionate about scalable systems and accelerating engineering teams, you will make a significant impact here.
About the Role:
This position will resolve incidents and collate data in support of root cause analysis and systems design.
Key Responsibilities:
Monitoring & Observability: Create and optimize monitoring queries; establish service level baselines.
Incident Response: Support senior engineers during incidents; contribute to post-incident reviews.
Disaster Recovery: Participate in and help execute disaster recovery tests.
Automation & Infrastructure as Code: Implement automation and execute code in production environments.
Documentation: Contribute to SRE knowledge bases and documentation.
Collaboration: Work with cross-functional teams including Development, QA, IT Operations, and Product SRE.
Required Skills & Tools
Programming & Scripting: Python, Bash scripting, Java, Angular
Cloud Platforms: AWS (EC2, S3, Lambda, Glue), Azure (Functions, Logic Apps, AKS), GCP (GKE, Cloud Functions)
Infrastructure as Code: Terraform, Ansible, Chef, Puppet
Containerization & Orchestration: Docker, Kubernetes
CI/CD & Automation: Jenkins, GitHub Actions, Bitbucket, GitLab
Monitoring & Observability: Prometheus, Grafana, DataDog, Dynatrace, Splunk, SignalFx
Networking & Security: AWS (VPCs, IAM, Transit Gateway, Cloud WAN, Route 53, KMS, RDS), Azure (Application Gateway, VNet, ExpressRoute, Private Link, Azure Firewall, Microsoft Sentinel, Microsoft Entra ID, RBAC)
Primary Location Base Pay Range: Evanston, IL $75,200 - $125,500. If performed in Illinois, the base pay range is $75,200 - $125,500. If performed in Chicago, IL, the base pay range is $78,700 - $131,400. If performed in New York, the base pay range is $78,700 - $131,400. If performed in New York City, the base pay range is $82,300 - $137,400. If performed in Rochester, NY, the base pay range is $68,000 - $113,400. If performed in New Jersey, the base pay range is $80,927 - $129,273. U.S. National Base Pay Range: $71,600 - $119,400. Geographic differentials may apply in some locations to better reflect local market rates. This job is eligible for an annual incentive bonus.
We know your well-being and happiness are key to a long and successful career. We are delighted to offer country specific benefits. Click here to access benefits specific to your location.
We are committed to providing a fair and accessible hiring process. If you have a disability or other need that requires accommodation or adjustment, please let us know by completing our Applicant Request Support Form or please contact **************.
Criminals may pose as recruiters asking for money or personal information. We never request money or banking details from job applicants. Learn more about spotting and avoiding scams here.
Please read our Candidate Privacy Policy.
We are an equal opportunity employer: qualified applicants are considered for and treated during employment without regard to race, color, creed, religion, sex, national origin, citizenship status, disability status, protected veteran status, age, marital status, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.
USA Job Seekers:
EEO Know Your Rights.
$82.3k-137.4k yearly Auto-Apply 1d ago
Reliability Engineer
Gemini 4.9
Houston, TX jobs
Axiom Space is building the world's first commercial space station - Axiom Station. Serving as a cornerstone for sustained human presence in space, this next-generation orbital platform fosters groundbreaking innovation and research in microgravity, and cultivates the vibrant, global space economy of tomorrow. Today, driven by the vision of leading humanity's journey off planet, Axiom Space is the principal provider of commercial human spaceflight services to the International Space Station and developer of advanced spacesuits for the Moon and low-Earth orbit. Axiom Space is building era-defining space infrastructure that drives exploration and fuels a vibrant space economy that will empower our civilization to transcend Earth for the benefit of every human, everywhere.
Axiom Space fosters a work environment inclusive of all perspectives. We are the pioneers of commercial space, leading the transformation of low-Earth orbit into a global space marketplace. Our mission-driven team is seeking a bold and dynamic Reliability Engineer who is fueled by high ownership, execution horsepower, growth mindset, and driven to understand our world, science/technology, and life itself, for the benefit of all on Earth and beyond.
POSITION SUMMARY
We are looking for a resilient, high-energy, experienced Reliability Engineer, who will be part of a multi-disciplinary team responsible for flight system safety and reliability that will interface with design engineers, safety engineers, and representatives from other subsystems to plan, document, and execute effective strategies for eliminating and/or minimizing risk and ensuring operational reliability.
KEY DUTIES & RESPONSIBILITIES
Plan and implement reliability engineering strategy, including implementation practices and processes.
Establish reliability requirements and reliability program growth strategy, as needed, for on-orbit component failures throughout the life of the station.
Conduct detailed system, assembly, and component level Failure Mode & Effects Analysis (FMEA) along with criticality analysis.
Develop and apply data analysis techniques for mechanical and electrical and/or electro-mechanical designs, including statistical process control, reliability modeling and prediction.
Generate reliability predictions, such as Mean Time Between Failures (MTBF) and Mean Time to Failure (MTTF), for new product designs through analysis or accelerated aging tests, and use them to inform maintenance, logistics, and sparing posture.
Experience in performing Fault Tree Analysis (FTA), Reliability Block Diagrams (RBD), and Functional Block Diagrams (FBD). Experience with software tools such as SAPHIRE and PTC Windchill Quality Solutions (formerly Relex) is preferred.
Experience and/or knowledge in the performance of spacecraft probability strategic assessments.
Collaborate with engineering system and subsystem teams to implement integrated reliability analyses.
Provide technical support at design reviews and test activities as required.
Collaborate with technical engineers to identify critical items of the system for implementing a strategy for verification and ensuring a safe and reliable system.
Prepare technical documentation and presentations for peers, management, and external customers and partners.
Communicate analysis progress, status, and potential issues to stakeholders and management.
Perform additional job duties as assigned.
QUALIFICATIONS:
To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
Education & Experience
Bachelor's degree in engineering (aerospace, electrical, mechanical, or related discipline).
10+ years of direct experience in spacecraft reliability-focused activities, including statistical process control, modeling/prediction.
Related experience in the reliability predictions for new designs of spacecraft, space vehicles, and their related components and systems is desired.
Experience in analyzing and processing flight hardware failure data to calculate or refine product reliability metrics and recommend improvement measures.
A broad background in space systems engineering including an understanding of space vehicle design, space system architecture development, ground systems support, payloads, and space missions and operations development.
Detailed knowledge of spaceflight subsystems and functions, preferably human spaceflight related.
Experience with NASA human spaceflight programs desired.
Track record of delivering products in ambiguous, fast-moving environments.
Uses good judgement to solve problems proactively, positively impacting hard challenges.
Proven ability to deliver high-quality results under tight deadlines.
Grit
Passion for space and the mission
Entrepreneurial, growth mindset
Perseverance
Resourceful, adaptable
Skills
Executes priorities with precision and pace
High EQ and ability to collaborate within teams and cross-functionally
Tech-savvy in using systems and tools to move faster and smarter
Excellent written and verbal communication skills
Competencies:
Embody our core values of leadership, innovation, and teamwork. In addition, to perform the job successfully, an individual should demonstrate the following competencies:
Accountability
Sense of Urgency
Extreme Ownership
Execution and Delivery
Efficiency
Effectiveness
WORK ENVIRONMENT:
Generally, an office environment, but can involve inside or outside work depending on the task.
Requirements
Must be able to complete a U.S. government background investigation.
Management has the prerogative to select at any level for which the position is advertised.
Proof of U.S. Citizenship or US Permanent Residency is a requirement for this position.
Must be willing to work evenings and weekends as needed to meet critical project milestones.
Physical Requirements
Work may involve sitting or standing for extended periods (90% of the time)
May require lifting and carrying up to 25 lbs. (5% of the time)
Equipment and Machines
Standard office equipment (PC, phone, printer, etc.)
Axiom Space is proud to be an equal opportunity employer. Axiom Space does not discriminate on the basis of race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with disability, or other applicable legally protected characteristics.
$89k-125k yearly est. Auto-Apply 60d+ ago
Senior Site Reliability Engineer (SRE)
America's Test Kitchen 3.5
Boston, MA jobs
America's Test Kitchen (ATK) is seeking a Senior Site Reliability Engineer (SRE) to focus on the stability, scalability, and performance of our core Cloud Infrastructure and Database Systems. This high-impact role is focused on applying software engineering principles to operations, reducing toil, and ensuring the reliability of our high-traffic website, app, and digital subscription platforms.
The successful candidate will be a proficient software developer and an expert in cloud architecture who thrives on designing and implementing automated infrastructure solutions, optimizing complex database performance, and collaborating closely with development teams to build resilient services.
This newly created role reports to the VP, Engineering, and will be a key contributor to ATK's DevOps and infrastructure strategy.
Core Technical Responsibilities
Reliability Engineering & Cloud Infrastructure:
Infrastructure-as-Code (IaC): Design, implement, and maintain our cloud infrastructure using AWS CDK. Focus on high availability, disaster recovery, and cost efficiency.
Automated Operations: Develop robust automation using code to manage infrastructure, deploy applications, handle monitoring, and execute system recovery, driving down manual effort.
Observability: Implement and manage comprehensive monitoring, logging, and alerting systems to provide deep visibility into system health, performance, and key Service Level Objectives (SLOs).
Incident Response: Lead incident response, root cause analysis (RCA), and post-mortem processes to identify and resolve systems-level issues and prevent recurrence.
Database Performance and Development:
Database Management: Own the operational health and performance tuning of critical relational and NoSQL database systems in the cloud.
Software Development: Act as a contributing developer, writing clean, well-tested code in core ATK services.
Security and Compliance: Implement and enforce security best practices across infrastructure and data layers, including network segmentation, access control (IAM), and encryption.
Skills and Experience Required
Technical Expertise & Development Proficiency:
SRE/DevOps Experience: 5+ years of progressive experience in an SRE or highly technical systems engineering role.
Cloud Architecture: Expert-level, hands-on experience designing and managing production environments in AWS (e.g., EC2, Lambda, ECS/EKS, VPC, RDS).
Database Mastery: Deep understanding of database internals, performance tuning, and operational management for data stores.
Coding Skills: Proven proficiency in at least one modern programming language used for systems automation, tooling, and backend service development.
Containerization: Strong experience with container orchestration technologies.
IaC Tools: Hands-on expertise with Infrastructure-as-Code tools.
Execution and Communication:
Problem Solver: Exceptional ability to diagnose and solve complex production issues across multiple domains (network, application, database, and infrastructure).
Collaboration: Strong track record of successfully partnering with software development teams to improve service reliability and delivery pipelines.
Technical Communication: Ability to clearly and concisely communicate technical concepts, status, and post-mortems to both engineering and leadership teams.
Qualifications
Bachelor's degree in Computer Science, Engineering, or equivalent professional experience
5+ years of experience in software development and/or site reliability engineering
Extensive AWS experience
Experience with high-traffic, customer-facing websites and apps
Mastery of Node.js and Java is a bonus
Location: This role can be based in our Boston, MA headquarters or is open to qualified remote applicants.
Salary Range:
$180,000 to $225,000
The range provided is based on what we reasonably expect to pay for this job as of the time of posting. The actual salary offered will be determined based on multiple factors, including but not limited to the candidate's relevant experience, job-related knowledge, skills, geographical location, and other job-related factors permitted by law.
About America's Test Kitchen
The mission of America's Test Kitchen (ATK) is to empower and inspire confidence, community, and creativity in the kitchen. Founded in 1992, the company is the leading multimedia cooking resource serving millions of fans with TV shows (America's Test Kitchen, Cook's Country, and America's Test Kitchen: The Next Generation), magazines (Cook's Illustrated and Cook's Country), cookbooks, a podcast (Proof), FAST channels, short-form video series, and the ATK Essential Membership for digital content. Based in a state-of-the-art 15,000-square-foot test kitchen in Boston's Seaport District, ATK has earned the trust of home cooks and culinary experts alike thanks to its one-of-a-kind processes and best-in-class techniques. Fifty full-time (admittedly very meticulous) test cooks, editors, and product testers spend their days tweaking every variable to find the very best recipes, equipment, ingredients, and techniques. Learn more at *************************************
Why America's Test Kitchen:
We're passionate about cooking, and about creating the best place to work. We're small enough for your ideas to make a big impact, and large enough to offer you opportunities to grow professionally at any stage of your career. We want you to take risks and make mistakes - that's how innovation happens in our test kitchen, in our offices, and in life.
We at America's Test Kitchen believe food media can be a powerful force for social change. We are passionate about building an inclusive workforce that represents many different cultures, backgrounds, abilities, identities, and perspectives.
We welcome your application.
$180k-225k yearly 47d ago
Senior Site Reliability Engineer (SRE)
America's Test Kitchen 3.5
Boston, MA jobs
America's Test Kitchen (ATK) is seeking a Senior Site Reliability Engineer (SRE) to focus on the stability, scalability, and performance of our core Cloud Infrastructure and Database Systems. This high-impact role is focused on applying software engineering principles to operations, reducing toil, and ensuring the reliability of our high-traffic website, app, and digital subscription platforms.
The successful candidate will be a proficient software developer and an expert in cloud architecture who thrives on designing and implementing automated infrastructure solutions, optimizing complex database performance, and collaborating closely with development teams to build resilient services.
This newly created role reports to the VP, Engineering and will be a key contributor to ATK's DevOps and infrastructure strategy.
Core Technical Responsibilities
Reliability Engineering & Cloud Infrastructure:
* Infrastructure-as-Code (IaC): Design, implement, and maintain our cloud infrastructure using AWS CDK. Focus on high availability, disaster recovery, and cost efficiency.
* Automated Operations: Develop robust automation using code to manage infrastructure, deploy applications, handle monitoring, and execute system recovery, driving down manual effort.
* Observability: Implement and manage comprehensive monitoring, logging, and alerting systems to provide deep visibility into system health, performance, and key Service Level Objectives (SLOs).
* Incident Response: Lead incident response, root cause analysis (RCA), and post-mortem processes to identify and resolve systems-level issues and prevent recurrence.
Database Performance and Development:
* Database Management: Own the operational health and performance tuning of critical relational and NoSQL database systems in the cloud.
* Software Development: Act as a contributing developer, writing clean, well-tested code in core ATK services.
* Security and Compliance: Implement and enforce security best practices across infrastructure and data layers, including network segmentation, access control (IAM), and encryption.
Skills and Experience Required
Technical Expertise & Development Proficiency:
* SRE/DevOps Experience: 5+ years of progressive experience in an SRE or highly technical systems engineering role.
* Cloud Architecture: Expert-level, hands-on experience designing and managing production environments in AWS (e.g., EC2, Lambda, ECS/EKS, VPC, RDS).
* Database Mastery: Deep understanding of database internals, performance tuning, and operational management for data stores.
* Coding Skills: Proven proficiency in at least one modern programming language used for systems automation, tooling, and backend service development.
* Containerization: Strong experience with container orchestration technologies.
* IaC Tools: Hands-on expertise with Infrastructure-as-Code tools.
Execution and Communication:
* Problem Solver: Exceptional ability to diagnose and solve complex production issues across multiple domains (network, application, database, and infrastructure).
* Collaboration: Strong track record of successfully partnering with software development teams to improve service reliability and delivery pipelines.
* Technical Communication: Ability to clearly and concisely communicate technical concepts, status, and post-mortems to both engineering and leadership teams.
Qualifications
* Bachelor's degree in Computer Science, Engineering, or equivalent professional experience
* 5+ years of experience in software development and/or site reliability engineering
* Extensive AWS experience
* Experience with high-traffic, customer-facing websites and apps
* Mastery of
$121k-149k yearly est. 47d ago
Head of Reliability
Blue Water Autonomy 4.0
Lexington, MA jobs
We're seeking a Head of Reliability to lead quality, reliability, and testing for our autonomous naval vessels. This is a critical leadership role at the intersection of hardware, software, and systems engineering. You'll collaborate closely with the Head of Engineering and Head of Software to ensure our ships operate reliably in the harshest ocean environments for months at a time.
This role is based in our Lexington, MA headquarters or at our R&D facility on the South Coast of Massachusetts. You'll build reliability and quality systems from the ground up - establishing world-class standards while working within startup constraints.
What You'll Do
Define and implement reliability, quality, and testing strategy for autonomous maritime systems
Build and lead a lean, high-performing reliability organization from the ground up
Establish quality gates, test protocols, and reliability metrics that balance rigor with speed
Develop reliability models and failure mode analysis for complex hardware/software systems operating in extreme ocean environments
Design accelerated life testing and environmental testing programs that validate performance under real-world conditions
Working with Engineering, create predictive maintenance strategies for systems that must operate autonomously for extended periods
Build comprehensive test strategies spanning unit, integration, system, and sea trials
Lead root cause analysis and corrective action processes to improve system performance
Develop supplier quality programs and incoming inspection protocols
Establish documentation, traceability, and continuous improvement processes
Collaborate with engineering, software, and operations teams to embed reliability into design, manufacturing, and testing
Interface with customers and regulatory bodies on reliability and quality requirements
Who You Are
Bring extensive experience in reliability, quality, or test engineering roles with increasing leadership responsibility
Have a proven track record building reliability programs for complex hardware/software systems
Have at least five years of experience with analysis and test of both hardware and software systems
Bring deep experience with FMEA, fault tree analysis, reliability modeling, and accelerated testing
Are comfortable working in resource-constrained environments and making pragmatic tradeoffs
Have background in defense, maritime, aerospace, automotive, or robotics industries
Understand autonomous and safety-critical systems operating in harsh environments
Make sound engineering judgments with incomplete information and adapt quickly
Are data-driven but know when to rely on test results, analysis, or experience
Communicate clearly and effectively across technical and non-technical teams
Are hands-on when needed and thrive in a fast-paced, mission-driven environment
Nice to Haves
Hold an advanced degree in mechanical, electrical, or systems engineering
Have experience with maritime systems, naval architecture, or marine engineering
Are familiar with MIL-STD reliability and quality standards
Have worked with AI/ML system validation and testing
Have led reliability programs in early-stage startups or innovation labs
Security clearance
What We Offer
Incredibly high-caliber teammates - you'll work directly with our co-founders and technical leads
A mission-driven environment designing technology that protects American lives and democracy
Significant ownership and influence over how reliability and quality are built at scale
A fast-paced, creative culture that values clarity, teamwork, and decisive execution
Equal opportunity employer. All hiring is contingent on eligibility to work in the U.S.; we are unable to sponsor or transfer visas
Compensation
Salary Range: $190,000-$220,000 annual base salary
Additional compensation: Startup equity options
Benefits: health, dental, vision, PTO
Final compensation will depend on experience and skill level
$190k-220k yearly 60d+ ago
Industrial Valve Reliability Technician
Atlantic Valve 4.5
Swedesboro, NJ jobs
Job Description
Industrial Valve Reliability Technician - Swedesboro, NJ
Step into a hands-on role where your mechanical know‑how keeps critical valve systems operating at their best. In this position, you'll be the go-to expert for inspecting, maintaining, and repairing valves while partnering closely with engineering and operations to solve problems that matter.
A day in the life
Start with a safety check and review of scheduled valve inspections.
Diagnose performance issues, interpret schematics, and determine effective repair strategies.
Execute preventive maintenance and functional testing to keep systems running reliably.
Work side-by-side with engineers and operators on project tasks and turnarounds.
Record findings, parts used, and actions taken with precise documentation.
Maintain an orderly, safe workspace and follow all safety procedures.
Share your expertise by supporting and coaching junior technicians as needed.
What you'll bring
High school diploma or equivalent; a technical certification in valve technology or related trade is a plus.
Hands-on experience servicing valves or performing comparable mechanical maintenance.
Solid grasp of valve operation, mechanics, and failure modes.
Ability to read and apply technical manuals, drawings, and wiring diagrams.
Sharp troubleshooting mindset and strong attention to detail.
Clear communicator and collaborative teammate.
Flexibility to work overtime or varied hours when required.
Technical strengths
Mechanical troubleshooting, service, and repair across industrial equipment.
Preventive maintenance and inspection best practices.
Journeyman's License (preferred where applicable).
Electrical installation, troubleshooting, and repair; industrial electrical experience.
Pipefitting fundamentals for safe, accurate work.
HVAC service, installation, maintenance, troubleshooting, and repair.
Boiler Certification and boiler troubleshooting expertise.
Tools and equipment you'll use
Multimeter and measurement gauges for precise diagnostics.
Power tools for efficient repair and maintenance.
Overhead crane for safe handling of components.
If you enjoy solving mechanical challenges, improving equipment reliability, and making a tangible impact, this role gives you the platform to do your best work.
$44k-52k yearly est. 5d ago
Quality Engineer
MRP Solutions 4.6
Plattsburgh, NY jobs
The Plant Quality Engineer is responsible for implementing process control programs that will assure production of finished goods that meet or exceed customer-specified requirements. He/she leads continual improvement by utilizing the appropriate quality tools and provides data analysis. This position will be focusing on the injection molding, lining and decoration departments of the plant. They will be focusing mostly on any quality issue that arises during production, handling corrective actions, etc. They will also be involved in product launches and engineering changes.
DUTIES AND RESPONSIBILITIES:
Analyze quality event data and aid in the development of projects directed at improving product and process quality
Coordinate department quality control activities to resolve production problems, maximize product reliability, and minimize cost.
Facilitate Failure Modes and Effects Analysis (FMEA) with Engineering in conjunction with advanced quality-planning activities and monitor the status throughout production life
Develop validation protocols, test method validation/Gage R & R and final reports. Perform data analysis. Assist in deviation investigation and resolution.
Direct employees engaged in product measurement, inspection, and testing activities to ensure quality control and reliability.
Develop, implement and maintain Statistical Process Control techniques
Assure compliance of the documented quality system (ISO9001:2015) and promote consistency throughout the organization
Monitor and revise quality documentation as required
Visit customers and suppliers when necessary. Act as a liaison to internal and external customers
Review all production methods for compliance to quality standards and for improvement of product and quality standards
Recommend methods for improving utilization of quality personnel
Interface with customer to resolve complaints, corrective actions, preventive actions, engineering changes and validation.
Initiate engineering change package.
Assist in formulating responses to audit findings
Follow Good Manufacturing Practices (GMP) and Good Documentation Practices (GDP) as applicable.
Conduct multiple projects simultaneously ranging from feasibility phase to post launch quality assessment and throughout production life
Conduct/assist investigations to determine root causes of defects and process failures and make recommendations for corrective action. Facilitate problem-solving meetings.
SKILLS REQUIRED
5-7 years of experience in Quality Engineering. Ideally in the Plastic Manufacturing Industry.
A B.S. degree in Mechanical Engineering or closely related discipline is preferred.
ASQ-CQE, CQA certification preferred
Minimal travel is required.
RESPONSIBILITIES:
Safety Responsibilities:
Supports a culture where employees address unsafe conditions and behaviors, make suggestions for improvements, and actively participate in implementing solutions
Identifies safety gaps and self-initiates corrective actions
Strictly adheres to plant safety, housekeeping, and 5S efforts
Understands, identifies, and corrects safety hazards
Drives a culture which empowers employees to understand and embrace what they own
Food Safety Responsibilities:
Monitors and verifies activities to ensure that finished goods and raw materials coming in and out of the facility meet consumer safety and quality standards.
Complies with all company food safety and quality assurance procedures
Reports any product or process failures that could impact food safety of manufactured products to the location's Food Safety Team Leader and submits an appropriate incident report
About MRP:
MRP Solutions (MRP), a plastic cap manufacturer, is a leading provider of high-quality, injection molded plastic closures, jars and packaging components used every day by millions of consumers around the globe. But we offer more than just plastic caps and lids - we deliver industry-leading packaging solutions tailored to each customer's unique requirements, providing best-in-class product protection while ensuring consumer confidence.
MRP Solutions combines extensive packaging expertise with a consultative approach to reliably uncover customer needs. By understanding your business goals, we can tailor smarter, safer, and more flexible packaging solutions that reduce cost and increase speed to market, helping your businesses capitalize on opportunity. By constantly innovating, MRP enables our customers to grow, making us a preferred partner.
We are passionate about partnering with distributors and manufacturers who understand that plastic caps and lids are a small but important part of how people experience their brands. Together, we deliver packaging with purpose.
Our Vision:
We deliver industry-leading packaging solutions tailored to each customer's unique requirements, providing best-in-class product protection while ensuring consumer confidence.
By constantly innovating, MRP enables our customers to grow, making us a preferred partner.
Our Values & Beliefs
Integrity - We have the courage to act with the highest level of integrity, even when no one is watching. We do what is right 100% of the time.
Value Creation - The sole reason a company exists is to create real long-term value for society. This starts with ensuring human safety, as value cannot be created without first protecting human life. We seek opportunities for mutual benefit with all of our stakeholders, including customers, employees, shareholders, suppliers, and the communities in which we operate. In everything we do, our overarching goal is to deliver superior results.
Accountability - We are accountable to each other and to our stakeholders. We say what we do and do what we say. We embrace a culture of ownership, empowering and equipping employees with the ability to own their outcomes.
Entrepreneurial - Everyone thinks and acts like owners, employing good economic and critical thinking skills while adopting the risk profile of our shareholders. We are inquisitive, constantly seeking out opportunities to improve, actively searching for and innovating across each and every aspect of our business. We relentlessly strive to understand and profitably anticipate what our customers need and value, because if our customers do not grow, we do not grow.
Respectful and Friendly - Everyone deserves to be treated with respect and dignity. Because everyone's perspective has value, we embrace diversity of thought, background and experiences. We are friendly and lead with a smile. What we do is important, but how we do it is what makes it impactful.
Change - We actively seek out and embrace change wherever profitable. Because society is constantly identifying and employing new and better ways of accomplishing tasks, we must constantly innovate, reinvent and, ultimately, destroy the old ways of doing business. We actively engage in rigorous debate and embrace challenge to ensure we stay relevant and deliver superior results.
Fulfillment - Our employees are the foundation of our success. We foster an environment enabling our employees to learn, grow and accept more accountability as they demonstrate capability. We promote more than just individual connection as community at work brings people together through common interest, objectives or experiences.
MRP Solutions is an equal opportunity and affirmative action employer. We do not discriminate in recruiting, hiring or promotion based on race, color, religion, sex (including sexual orientation, gender identity or expression, transgender status), national origin, age, disability, medical condition, marital or protected veteran status or any other basis or characteristic prohibited by applicable federal, state, or local law. Consistent with the obligations of state and federal law, MRP Solutions will make reasonable accommodations for qualified individuals with disabilities. Any employee who needs a reasonable accommodation should contact Human Resources
$72k-96k yearly est. Auto-Apply 5d ago
Quality Engineer
MRP Solutions 4.6
Somerset, NJ jobs
Job Description
MRP Solutions, LLC is a leading manufacturer of injection molded, high quality rigid-plastic packaging components serving customers globally. MRP Solutions is committed to providing our customers with compliant and innovative quality products, and the Quality Engineer plays a pivotal role in supporting MRP Solutions' mission. As MRP Solutions continues to grow, our team needs valuable team members who can help ensure sustainable quality by driving quality improvement initiatives and routine problem solving. This individual will be a leader and mentor for quality understanding, improvements, and sustainability.
ESSENTIAL FUNCTIONS:
Implementation of Quality Risk Management in relevant aspects of the Management System
Investigation and CAPA management, including robust root cause analysis and data mining
Champion Supplier Quality Program, monitor and report on Supplier Performance
Determines quality improvement parameters by identifying statistical methods relevant to manufacturing processes
Prepare reports by collecting, analyzing, and summarizing data; making recommendations for improvement
QUALIFICATIONS:
Background in Quality processes, procedures, and documentation
Technical proficiency in Microsoft Office Suite, including but not limited to Word, Excel, and Outlook
Organizational skills to manage multiple data systems
Ability to work on multiple projects in parallel
Ability to perform in team environment
Self-motivated and self-disciplined
Preferred
Familiarity with equipment/process verification procedures
Manufacturing equipment experience/operations background
Statistical analysis experience
EDUCATION and EXPERIENCE:
3+ years of experience in a quality organization
Degree preferred
$75k-103k yearly est. 12d ago
Quality Engineer
FX Staffing 4.1
West Chester, PA jobs
1. Maintain, track and monitor Customer
2. Serve as the customer's primary quality contact for all assigned program products being supplied to the customer assembly plants. Hold face-to-face communication activities with suppliers and customers to ensure program development.
3. Lead and coordinate new product evaluations during customer prototype reviews, new model reviews or pre-production trials with assistance from other engineers, Coordinators and/or Manager. Attend customer and supplier build activities to gather information to evaluate the status of the developing program.
4. Create appropriate customer and internal documents to support the success of program development. Develop and maintain spreadsheets/databases that track program development and/or program issues.
5. Ensure all activities are on schedule via the customer's own program/project planning tools and via the supplier planning tools. Provide technical support to customer Quality Engineering & Production Engineering departments.
6. Develop and perform quality audits to ensure supplier state of readiness with assistance from Senior Engineer and/or Manager.
7. Maintain databases to track programs, change requests, quality issue status and others as needed.
8. Coordinate activities for quality improvement at all manufacturing plants, communicate customer needs and drive improvement activities. Assist continuous improvement teams, support in the development/implementation of containment activities and corrective action countermeasures. Serve as the primary coordinator for the "Go & See" at the customer's locations for customer concerns.
Other Responsibilities and Duties
1. Coordinate negotiation activities for the removal of scrapped / rejected parts from Parts Per Million (PPM) quality record, which serves as a critical Key Performance Indicator (KPI) for the customers' judgment of any supplier location. In addition, coordinate negotiations of the reduction of scrap costs, quality defects, and sort/re-work operations that may impact the manufacturing plants.
2. Access and maintain Customer, Tier I/II and Supplier required websites and systems.
3. Assist in the development and maintenance of a mass production performance tracking program and supplier rating system.
4. Assist in analyzing nonconforming product at the customer facility when required.
5. Assist in action plan development based on the corporate, customer and departmental targets/objectives. The ability to execute a plan; check the actual achievement against the plan and re-plan based on less than targeted results. The “PDCA” approach to be conducted with approval by your manager including the ability to assist in the development of targets/objectives from year-to-year and again track the results.
6. Update management on the status of the assigned programs through meetings and/or reports, including tracking and reporting to senior management changes to the program which impact the business case and program profitability.
7. Ability to represent and/or present on the quality condition/status of any and all products. This requires the continuous collection of information from others, i.e. internally, at plant locations, Tier II level locations and others. This is to assure you keep the entire team prepared for any customer required update meetings, including supporting compliance with the customer's supplier requirements.
8. Other special projects, tasks or duties as assigned by supervisor and/or management.
Qualifications
Education and Experience:
Desired Qualifications:
- Bachelor's Degree in Technical Field with a minimum of five (5) years of working with automotive OEMs in Quality Assurance. Brake system experience is preferred; however, other safety-critical product quality experience can meet the requirement in place of brake system experience.
Minimum Qualifications:
- A minimum of 8 years professional experience or an equivalent combination of education and experience.
- MUST have corporate level experience at the tier I level.
$69k-97k yearly est. 50d ago
Global Advanced Product Quality Engineer
Yeti 4.4
Austin, TX jobs
At YETI, we believe that time spent outdoors matters more than ever and our gear can make that time extraordinary. When you work here, you'll have the opportunity to create exceptional, meaningful work and problem solve with innovative team members by your side. Together, you'll help our customers get the high-quality gear they need to make the most of their adventures. We are BUILT FOR THE WILD.
The Advanced Product Quality Engineer has primary responsibility of a product from "end to end lifecycle" from concept through the entire product development process, including once the product goes into the Wild. This includes product concept/requirement, design, validation, and manufacturing capability. Additionally, the APQE may take on supplier quality responsibilities directly or partner with global Supplier Quality Engineers (SQEs) to manage supplier quality for a portion of YETI's manufacturing partners and will be capable of delivering results in a fast paced and demanding environment. This person will have a significant level of partnership and engagement with project management, marketing, product development, manufacturing, and suppliers. He/She will work as a core team member on major product development projects. In addition, the advanced product quality engineer will be responsible for developing and standardizing processes and tools to better facilitate product development and manufacturing activities.
More specifically, this individual's activities will focus on bridging design and manufacturing and ensuring predictable quality through the product lifecycle. This is achieved by managing supplier qualification, ensuring solid manufacturing capability, and driving PPAP on new and revised products. The role thus serves as a bridge between new product development and manufacturing process approval and is responsible for ensuring that YETI's manufacturing partners are capable of producing high-quality goods on an ongoing basis. Additionally, this role will facilitate problem solving and sustaining activities on existing products, audit supplier processes for quality and conformance, and drive continuous improvement activities related to product and process improvement. This role will also gather pre-shipment and ongoing warranty/reliability data, leverage LEAN tools, and partner with tier 1 and select tier 2 suppliers to drive continuous improvement, ultimately serving as the primary contact to funnel requested changes into the YETI system and establish appropriate PPAP plans for product and process changes.
Responsibilities:
* Drive supplier quality throughout YETI's suppliers by partnering to implement manufacturing processes capable of ensuring a high degree of product conformance through the implementation of sustainable processes and statistical process controls
* Work with cross functional teams and incorporate Advanced Product Quality Planning (APQP) in major product development projects and product changes; focus on Product requirement, Print reviews, DFx, FMEA, DVP, CAPA, Product and Process capability, Measurement System Analysis, Control Plans, and risk analysis
* Drive the completion of Production Part Approval Process documentation packages with core team members and the suppliers
* Analyze statistical data and product specifications to determine requirements and establish quality and reliability objectives
* Work with suppliers to standardize processes, utilize metrics, and tools to improve product quality
* Partner with YETI's suppliers to funnel required or requested changes through the change management process ultimately ensuring YETI's partners implement processes that drive a high degree of conformance and quality
* Gather pre-shipment and warranty data related to specific YETI product families, leverage LEAN tools to establish root cause solutions and drive continuous improvement
* Analyze new supplier capability as part of the supplier selection process
* Drive continuous improvement by creating, maintaining, and implementing quality documentation such as SOPs, SIPs, workflows, standards libraries, manuals, etc.
* Champion customers and translate the voice of the customer into product requirements and design. Develop and maintain quality/reliability standards to ensure alignment with customer expectations
* Perform quality reviews and internal audits on existing product development projects
* Participate in supplier qualification and support sourcing decisions
* Enthusiastically collaborate and drive a culture of quality, reliability, and continuous improvement throughout the organization
Qualifications and attributes:
* Bachelor's Degree in Engineering or related field is required
* Greater than 5 years of engineering experience
  * Hard goods space: experience working with large-tonnage, molded-in-color plastic injection parts and processes
* Previous experience in quality and design engineering is preferred
* Knowledge of quality tools, APQP, DFx, FMEA, DOE, GR&R, SPC, CAPA and PPAP
* Knowledge of statistical methods and statistical analysis tools
* Knowledge of reliability prediction methodologies
* Ability to solve problems, perform risk analysis, make quality and timely decisions
* Ability to lead and influence cross functional teams and being a team player
* Ability to travel both internationally and domestically
* Previous experience within a global full-service supply chain model.
Additional Attributes - nice to have
Knowledge and experience with a broad array of manufacturing methods, materials and tooling. Good vision and measured color perception. Additional computer skills preferred: PLM, SharePoint, Salesforce, Tableau, SolidWorks or other CAD software.
Benefits & Perks:
YETI is proud to be an Equal Opportunity Employer.
Our commitment to creating a diverse, equitable, and inclusive culture is at the center of everything we do for our employees. We embrace all applicants looking to bring their authentic selves to YETI and contribute to our mission of keeping the wild WILD. Find out more about our commitment to DE&I at yeti.com/esg.html.
All applicants for employment will be considered without regard to an individual's race, color, sex, gender identity, gender expression, religion, age, national origin or ancestry, citizenship, physical or mental disability, medical condition, family care status, marital status, sexual orientation, genetic information, military or veteran status, or any other basis protected by federal, state or local laws.
YETI Applicant Privacy Notice
YETI welcomes and encourages applications from people with disabilities. Accommodations are available upon request for candidates taking part in all aspects of the selection process. If you require accommodation in order to apply for a job, please contact us at accommodationrequest@yeti.com.
$75k-102k yearly est. Auto-Apply 56d ago
Sustaining Engineer
Diamondback Industries 4.3
Texas jobs
Diamondback Industries is expanding and evolving our Engineering team, with multiple roles open for engineers who bring diverse experience, practical knowledge, and creative problem-solving skills. We welcome candidates from a wide range of technical and educational backgrounds who are ready to contribute new ideas and help shape the next phase of our growth.
At Diamondback, we're more than a manufacturing company: we're a team of driven professionals building American-made products that perform every time. As a leader in Energetics and an innovator of disposable setting tools, we deliver precision-engineered well completion solutions backed by an exceptional customer experience.
Watch this behind-the-scenes look at Diamondback, featured by Titans of CNC Machining!
Our success is powered by our people. That's why we offer a comprehensive benefits package, including medical coverage with nearly 90% of the monthly premium paid by Diamondback, 401(k) matching, and paid time off.
If you're ready to grow your engineering career with a company that values innovation, accountability, and performance, Diamondback Industries is where opportunity is built.
SUSTAINING ENGINEER
JOB SUMMARY: Diamondback Industries is seeking a creative, hands-on, early-career Sustaining Engineer to join our Engineering team. This role is ideal for a recent Mechanical Engineering graduate (or an upcoming graduate) who thrives on solving problems, refining product performance, and supporting manufacturing in a fast-paced, high-impact environment.
Our Engineers bring ideas to life! From developing design improvements and creating detailed engineering files to assembling, testing, and taking products into the field for validation, you will have hands-on ownership at every stage. As a Sustaining Engineer, you will support our current product lines, drive continuous improvement, and ensure Diamondback's tools and energetics perform flawlessly in the field.
JOB FUNCTIONS:
Support and improve existing Diamondback product lines, focusing on reliability, manufacturability, and performance.
Take ownership of sustaining projects from concept to implementation, including updates to design, processes, and documentation.
Troubleshoot manufacturing and assembly issues; partner with Production and Quality to implement effective solutions.
Perform root-cause analysis and support corrective actions to reduce defects and enhance product consistency.
Create, update, and maintain engineering drawings, BOMs, manuals, assembly instructions, and all sustaining documentation.
Review Non-Conformance Reports (NCRs), provide engineering dispositions, and manage Engineering Change Requests (ECRs) through implementation.
Develop and execute test plans, perform failure analysis, and recommend design or process improvements.
Collect, analyze, and interpret test data using standard measurement tools and instrumentation.
Participate in prototype builds, hands-on testing, and occasional field support for product deployment or troubleshooting.
Collaborate with Manufacturing, QC, R&D, Supply Chain, and Compliance; train personnel on product use; and support all regulatory requirements (ATF, DOT, etc.).
JOB QUALIFICATIONS and REQUIREMENTS:
Mechanical Engineering degree preferred. However, candidates with strong mechanical aptitude, relevant industry experience, or sustained hands-on engineering background are encouraged to apply.
Must pass a Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) background check.
Proficiency with CAD systems, preferably SolidWorks, including file management and revision control.
Strong mechanical aptitude; comfortable with hand tools, mechanical lab equipment, and basic hydraulic and electrical schematics.
Understanding of test systems, instrumentation, and data acquisition basics.
Excellent problem-solving abilities and willingness to be hands-on.
Strong communication skills and the ability to work collaboratively across teams.
Curiosity, creativity, and passion for learning new processes and technologies.
The ability to lift 25/50/80/100 pounds regularly
The ability to respond quickly to sounds and flashing lights
The ability to work in extreme weather
The ability to wear personal protective equipment correctly
Ready to build What's Next with us?
EMAIL YOUR RESUME OR CONTACT INFORMATION TO:
Charles Speck, Director of Engineering: ***************************************
Traci Richards, Director of HR, Marketing, Compliance: ****************************************