Reliability Engineer jobs at Smiths Group - 3935 jobs

  • Site Reliability Engineer II

    Axon Enterprise 4.5 company rating

    Boston, MA jobs

    Join Axon and be a Force for Good. At Axon, we're on a mission to Protect Life. We're explorers, pursuing society's most critical safety and justice issues with our ecosystem of devices and cloud software. Like our products, we work better together. We connect with candor and care, seeking out diverse perspectives from our customers, communities and each other. Life at Axon is fast-paced, challenging and meaningful. Here, you'll take ownership and drive real change. Constantly grow as you work hard for a mission that matters at a company where you matter. Your Impact As a Site Reliability Engineer II within the APX SRE organization, you'll focus on delivering practical, scalable solutions to support the reliability and performance of our mission‑critical, cloud‑native global Kubernetes platform and the services that run on it. You care deeply about system stability, clear documentation, and creating tools that improve the developer experience. Work Location: This role is based out of our Boston office and follows a hybrid schedule. We rely on in‑person collaboration and ask that team members work onsite Tuesdays through Fridays, with the flexibility to work remotely on Mondays, unless there is an approved workplace accommodation. We believe that connection fuels innovation, and our in‑office culture is designed to foster meaningful teamwork, mentorship, and shared success. What You'll Do As an SRE, you'll play a critical role in building the infrastructure and tools that power reliable, scalable, and secure engineering operations across the company. You will: Build robust, easy‑to‑use Kubernetes platforms and tools that enable engineering teams to provision and operate services rapidly, consistently, and securely. Exemplify cloud‑native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem‑solving skills, with the ability to debug problems in cloud‑native distributed systems. 
Influence and educate the engineering organization to adopt new and improved architectural patterns. Provide robust documentation for use by engineers to promote self‑service. Continually seek improvement within our Kubernetes platform for improved reliability, operability, and cost efficiency. Take calculated risks, champion new ideas, and cultivate your craft. What You Bring 3+ years of applicable experience in platform engineering and container orchestration. Experience building platforms on clouds such as Azure and AWS. Building, operating, and innovating clustering solutions for Kubernetes platforms like AKS, EKS, or similar in production at scale. Experience with programming languages such as Python, Go, C#, Java, or similar. Experience with code collaboration tools such as GitHub, ArgoCD, or similar. Experience utilizing CI/CD platforms to automate provisioning infrastructure, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience using Infrastructure as Code tools for provisioning infrastructure such as Terraform, Pulumi, or similar. Experience designing tooling to simplify the operational management of SaaS/PaaS systems. Familiarity with building flexible and testable Infrastructure as Code modules. Empathy to support the needs of software engineers. Benefits that Benefit You Competitive salary and 401k with employer match. Discretionary time off. Paid parental leave for all. Fitness programs. Emotional & Development Programs. And yes, we have snacks in our offices. Benefits listed herein may vary depending on the nature of your employment and the location where you work. Base Pay Range $115,500 - $184,800 USD Don't meet every single requirement? That's ok. At Axon, we aim far. We think big with a long‑term view because we want to reinvent the world to be a safer, better place. We are also committed to building diverse teams that reflect the communities we serve. 
Studies have shown that women and people of color are less likely to apply to jobs unless they check every box in the job description. If you're excited about this role and our mission to Protect Life but your experience doesn't align perfectly with every qualification listed here, we encourage you to apply anyway. You may be just the right candidate for this or other roles. The above is not intended as, nor should it be construed as, exhaustive of all duties, responsibilities, skills, efforts, or working conditions associated with this job. The job description may change or be supplemented at any time in accordance with business needs and conditions. Some roles may also require legal eligibility to work in a firearms environment. Axon's mission is to Protect Life and is committed to the well‑being and safety of its employees as well as Axon's impact on the environment. All Axon employees must be aware of and committed to the appropriate environmental, health, and safety regulations, policies, and procedures. Axon employees are empowered to report safety concerns as they arise and activities potentially impacting the environment. We are an equal opportunity employer that promotes justice, advances equity, values diversity and fosters inclusion. We're committed to hiring the best talent - regardless of race, creed, color, ancestry, religion, sex (including pregnancy), national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, genetic information, veteran status, or any other characteristic protected by applicable laws, regulations and ordinances - and empowering all of our employees so they can do their best work. If you have a disability or special need that requires assistance or accommodation during the application or the recruiting process, please email **********************. Please note that this email address is for accommodation purposes only. Axon will not respond to inquiries for other purposes.
    $115.5k-184.8k yearly 4d ago
  • Site Reliability Engineer

    Apple Inc. 4.8 company rating

    San Francisco, CA jobs

    San Francisco, California, United States Software and Services The Apple Services Engineering (ASE) team is one of the most exciting examples of Apple's long-held passion for combining art and technology. These are the people who power the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. And they do it on a massive scale, meeting Apple's high expectations with high performance to deliver a huge variety of entertainment in over 35 languages to more than 150 countries. These engineers build secure, end-to-end solutions. They develop the custom software used to process all the creative work, the tools that providers use to deliver that media, all the server-side systems, and the APIs for many Apple services. Thanks to Apple's unique integration of hardware, software, and services, engineers here partner to get behind a single unified vision. That vision always includes a deep commitment to strengthening Apple's privacy policy, one of Apple's core values. Although services are a bigger part of Apple's business than ever before, these teams remain small, forward‑thinking, and cross‑functional, offering greater exposure to the array of opportunities here. Description Apple Services Engineering infrastructure is BIG. Operating at our scale, across multiple geographically dispersed data centers and servicing hundreds of millions of users presents unique challenges. As an SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise. SREs at Apple own the full infrastructure stack; from device driver performance debugging to content delivery network traffic management - our responsibilities are both broad and deep. ASE runs the majority of its systems on Linux. We run a mix of open source, vendor‑licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software deployment, logging, and monitoring. You'll learn these tools and have opportunities to improve them. 
Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded. Culturally we believe in a close partnership with our development teams and aim to design & build new services together. We're passionate about software and automation in SRE and develop a variety of tooling and infrastructure. Our services run on mixed & hybrid platforms. Minimum Qualifications BS/MS in Computer Science or Equivalent At least 3 years in a Reliability Engineering, DevOps or infrastructure focused role Advanced experience with programming languages (GoLang, Python, Java) Passion for designing and building reliable systems Strong sense of ownership and integrity demonstrated through clear communication and collaboration Deep systems and infrastructure knowledge Advanced knowledge and hands‑on experience with CI/CD systems Automation advocate - you truly believe in removing operation load with software Understanding of the Linux Operating System, standard networking protocols, and components Preferred Qualifications Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment Hands‑on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Puppet, Ansible, and Spinnaker) Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks Excellent troubleshooting and problem solving skills Experience with scale testing, disaster recovery, and capacity planning Familiarity with microservices architecture and container orchestration with Docker & Kubernetes Demonstrated ability to deliver results on time with high quality At Apple, base pay is one part of our total compensation package and is determined 
within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $147,400 and $272,100, and your base pay will depend on your skills, qualifications, experience, and location. Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including: Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses - including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits. Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program. Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant. Apple accepts applications to this posting on an ongoing basis.
    $147.4k-272.1k yearly 1d ago
  • Site Reliability Engineer - Scale, Automation & Observability

    Apple Inc. 4.8 company rating

    San Francisco, CA jobs

    A leading technology company based in San Francisco is seeking a Site Reliability Engineer (SRE) to manage their vast infrastructure. The role requires a blend of deep technical skills in reliability engineering and a passion for building reliable systems. Candidates should have a strong background in programming, CI/CD, and Linux. The position offers competitive pay, opportunities for stock options, and comprehensive benefits, along with the chance to work in a collaborative, innovative environment that emphasizes teamwork and development.
    $147k-191k yearly est. 1d ago
  • Senior GaN Reliability Engineer

    Raytheon 4.6 company rating

    Boston, MA jobs

    Country: United States of America Onsite U.S. Citizen, U.S. Person, or Immigration Status Requirements: The ability to obtain and maintain a U.S. government issued security clearance is required. U.S. citizenship is required, as only U.S. citizens are eligible for a security clearance Security Clearance: DoD Clearance: Secret At Raytheon, the foundation of everything we do is rooted in our values and a higher calling - to help our nation and allies defend freedoms and deter aggression. We bring the strength of more than 100 years of experience and renowned engineering expertise to meet the needs of today's mission and stay ahead of tomorrow's threat. Our team solves tough, meaningful problems that create a safer, more secure world. Job Summary: An exciting opportunity has opened for a Senior GaN Reliability Engineer to join our Advanced Microelectronic Solutions Department (AMSD) at Raytheon in Andover, MA. AMSD develops, designs, and manufactures compound semiconductor devices, microwave/millimeter wave integrated circuits, and modules for defense applications. Your subject matter technical expertise in reliability testing and characterization, device physics, and applied statistics is key to success in this role. As a senior engineer, you will need to communicate efficiently and effectively with a cross-functional team and are expected to demonstrate strong ownership and accountability. 
This 1st shift role will be 100% on-site and based in Andover, MA What You Will Do: Own end-to-end reliability tasks (plan, design, test, analyze, report) with strong initiative to go above and beyond Push Device Reliability Function to a higher level through exploration of new ideas, advising and implementing new approaches and methodologies Interact with a team of process engineers, technology owners, support team, and project leads Produce key reliability metrics such as MTTF, failure rates, and safe operating conditions Work on compound semiconductor transistors, passive components, and monolithic microwave integrated circuits (MMICs) Interact with probe stations and multiple reliability testing systems (DC and RFOL) Support onsite wafer foundry's next-generation technology development, tool qualification, and reliability-oriented design rule verification and development Coordinate failure analysis Qualifications You Must Have: Typically requires a Bachelor's degree in Science, Technology, Engineering or Mathematics (STEM) and a minimum 5 years of prior work experience or an Advanced Degree in a related field with minimum 2 years of prior work experience with device-level reliability testing and characterization in compound semiconductor field, preferably GaN and GaAs Experience in the design and execution of reliability test plans Experience with applied statistics, reliability modeling, data analysis, and metrics generation The ability to obtain and maintain a U.S. government issued security clearance is required. U.S. citizenship is required, as only U.S. citizens are eligible for a security clearance. Qualifications We Prefer: Solid state device physics knowledge Semiconductor fabrication experience Failure analysis experience with SEM, FIB, STEM, etc. 
Experience in RF field Experience with lab tools such as probe stations, curve tracers, Keithley/Keysight equipment Experience with related industry standards such as JEDEC, MIL Python, PyQt, and data analysis tools (Pandas, Sklearn, Matplotlib) Project management skills Strong informal and formal reporting and presenting skills What We Offer: Our values drive our actions, behaviors, and performance with a vision for a safer, more connected world. At RTX we value: Trust, Respect, Accountability, Collaboration, and Innovation. As part of our commitment to maintaining a secure hiring process, candidates may be asked to attend select steps of the interview process in-person at one of our office locations, regardless of whether the role is designated as on-site, hybrid or remote. The salary range for this role is 82,000 USD - 164,000 USD. The salary range provided is a good faith estimate representative of all experience levels. RTX considers several factors when extending an offer, including but not limited to, the role, function and associated responsibilities, a candidate's work experience, location, education/training, and key skills. Hired applicants may be eligible for benefits, including but not limited to, medical, dental, vision, life insurance, short-term disability, long-term disability, 401(k) match, flexible spending accounts, flexible work schedules, employee assistance program, Employee Scholar Program, parental leave, paid time off, and holidays. Specific benefits are dependent upon the specific business unit as well as whether or not the position is covered by a collective-bargaining agreement. Hired applicants may be eligible for annual short-term and/or long-term incentive compensation programs depending on the level of the position and whether or not it is covered by a collective-bargaining agreement. 
Payments under these annual programs are not guaranteed and are dependent upon a variety of factors including, but not limited to, individual performance, business unit performance, and/or the company's performance. This role is a U.S.-based role. If the successful candidate resides in a U.S. territory, the appropriate pay structure and benefits will apply. RTX anticipates the application window closing approximately 40 days from the date the notice was posted. However, factors such as candidate flow and business necessity may require RTX to shorten or extend the application window. RTX is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or veteran status, or any other applicable state or federal protected class. RTX provides affirmative action in employment for qualified Individuals with a Disability and Protected Veterans in compliance with Section 503 of the Rehabilitation Act and the Vietnam Era Veterans' Readjustment Assistance Act.
    $84k-105k yearly est. 20h ago
  • Founding SRE Engineer - Reliability & Growth

    Asana 4.6 company rating

    San Francisco, CA jobs

    A leading software company is seeking experienced Software Engineers to join the new Site Reliability Engineering team. This role focuses on building reliable, scalable systems and leading projects across infrastructure. Candidates should have strong software engineering skills and a passion for reliability. The position offers a hybrid work model and generous compensation packages with additional benefits.
    $147k-189k yearly est. 2d ago
  • SRE II - Hybrid Cloud Reliability Engineer

    Axon Enterprise 4.5 company rating

    Boston, MA jobs

    A technology company is seeking a Site Reliability Engineer in Boston to enhance cloud-native services and ensure high reliability. Responsibilities include building foundational platforms, utilizing cloud tools, and following best practices. Candidates should have over 5 years of relevant experience in managing cloud platforms and programming languages like Python or Go. This role follows a hybrid work schedule, promoting collaboration and innovation.
    $88k-122k yearly est. 1d ago
  • Reliability Engineer

    Mini-Circuits 4.1 company rating

    New York, NY jobs

    Mini-Circuits designs, manufactures and distributes integrated circuits, modules, and sub‑systems for high‑performance radio frequency (RF) and microwave applications. With design, sales and manufacturing locations in over 30 countries, Mini‑Circuits' products are used in a range of wired and wireless communications applications. Our products are also used in detection, measurement and imaging applications, including military communication, guidance and electronic countermeasure systems, commercial, scientific, military land, sea and aircraft; automotive systems, medical systems, and industrial test equipment. Mini‑Circuits sells its products to over 20,000 customers globally through our direct sales force, applications engineering staff, sales representatives, as well as through our extensive website. Position Summary: The Reliability Engineer is responsible for conducting reliability studies of existing products and coordinating new product qualification prior to market release. The candidate will work in collaboration with various teams including Reliability, Design Engineering, Product Engineering, Failure Analysis and Project Management teams. Salary Range: $99,000 - $117,000 per year Job Function: Participate in the product development meetings and guide the team to develop reliable products that meet internal specifications and customer requirements. Develop qualification plans for new products, primarily MMICs but also support other product lines including but not limited to Low Temperature Co‑Fired Ceramics, PCBA products, RF accessories and Core & Wire Products. Analyze new products for similarity with existing released products in terms of package, die process and design to determine Qualification by Similarity, thus streamlining qualification testing. Design and execute both device level and package level qualification tests including but not limited to MSL pre‑conditioning, Thermal cycling, UHAST, HTSL, ESD and Life Tests. 
Define ESD Human Body Model (HBM) and Charged Device Model (CDM) tests as per JEDEC standards. Collaborate with Engineering Test Teams to execute Accelerated Life Tests, High Temperature Operating Life Test. Execute Mechanical stresses such as Vibration, Mechanical Shock, Constant Acceleration & Bend Testing. Co‑ordinate with external labs for outsourced tests. Review RF Test data before and after stresses to analyze changes in performance. Collaborate with Failure Analysis teams to understand the root cause of failures. Identify and record any non‑conformities. Monitor solution implementations to verify effectiveness of corrective actions. Ensure On‑Time Completion of Qualification activities and escalate any potential delays. Present Qualification results to all relevant stakeholders to help Design teams initiate changes to improve reliability performance. Prepare written reports summarizing the results of product performance and failure analysis for both internal purposes as well as customer review. Interface with customers and suppliers on product reliability as required. Interface with suppliers to purchase lab equipment. Support reliability assessments originating from production of released products or customer returns. Makes decisions within area of specialty, manages medium to large projects. Promotes ISO9001/AS9100 Quality. The duties, responsibilities and expectations described above are not a comprehensive list and additional tasks may be assigned to the member, within the scope of the position. Qualifications: BS in Mechanical Engineering, Electrical Engineering, Materials, Reliability, Industrial Engineering or Physics. Advanced degree preferred. 3‑5 years' experience as a Reliability Engineer in Semiconductor or equivalent industry. Familiarity with common industry standards including JEDEC, MIL‑STD‑883, MIL‑STD‑202 and AEC‑Q. Experience with Reliability Qualification by Similarity. Experience with Environmental, Mechanical and ESD stresses. 
Experience with problem solving methodologies and leading root cause analysis. Experience with customer returns failure analysis support. Must have familiarity with failure analysis techniques including Scanning Acoustic Microscopy (SEM), Radiographic Inspection (X‑Ray), Cross‑Section methods. Familiarity with MTTF, MTBF Calculations. Experience with Reliability prediction modeling and tools like Weibull++ (or equivalent reliability software) Experience with Data analysis tools including Advanced Excel, JMP, Minitab. Ability to analyze component performance data in reliability tests, including large variety of test parts and multiple design variations. Experience with Design of Experiments, FMEA, product design reviews and DFM. Excellent written and oral communication skills. Physical Demands: The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. While performing the duties of this job, the employee is regularly required to talk and hear. The employee frequently is required to stand, walk, sit and use hands to operate a computer keyboard. The employee is occasionally required to reach with hands and arms. The employee must occasionally lift and/or move up to 10 pounds. Specific vision abilities required by this job include close vision, and ability to adjust focus. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions. Additional Requirements/Skills: Ability and willingness to abide by Company's Code of Conduct. Occasional travel, some overnight, as required (up to 10%). Disclaimer: The listed qualifications and requirements for each position are intended as guidelines. Mini‑Circuits reserves the right to hire outside of these guidelines at Management's discretion. 
Mini‑Circuits is an Equal Opportunity Employer and does not discriminate on the basis of actual or perceived age, race, creed, color, national origin, sexual orientation, military status, sex, disability, predisposing genetic characteristics, marital status, familial status, gender identity, gender dysphoria, pregnancy‑related condition, and domestic violence victim status or protected class characteristic, or any other protected characteristic as established by federal or state law.
    $99k-117k yearly 5d ago
  • Product Quality Engineer

    Applied Materials 4.5 company rating

    Santa Clara, CA jobs

    **Who We Are** Applied Materials is a global leader in materials engineering solutions used to produce virtually every new chip and advanced display in the world. We design, build and service cutting-edge equipment that helps our customers manufacture display and semiconductor chips - the brains of devices we use every day. As the foundation of the global electronics industry, Applied enables the exciting technologies that literally connect our world - like AI and IoT. If you want to push the boundaries of materials science and engineering to create next generation technology, join us to deliver material innovation that changes the world. **What We Offer** Salary: $110,500.00 - $152,000.00 Location: Santa Clara,CA You'll benefit from a supportive work culture that encourages you to learn, develop, and grow your career as you take on challenges and drive innovative solutions for our customers. We empower our team to push the boundaries of what is possible-while learning every day in a supportive leading global company. Visit our Careers website to learn more. At Applied Materials, we care about the health and wellbeing of our employees. We're committed to providing programs and support that encourage personal and professional growth and care for you at work, at home, or wherever you may go. Learn more about our benefits (********************************** . General Profile: Requires knowledge and experience in own discipline; still acquiring higher-level knowledge and skills. Builds knowledge of the company, processes and customers. Solves a range of straightforward problems. Analyzes possible solutions using standard procedures. Receives a moderate level of guidance and direction. Key Responsibilities 1. Develops, applies, revises, maintains and/ or tests quality standards to ensure alignment with customer expectations. 2. Designs and implements methods and procedures for inspecting, testing and evaluating the quality of products 3. 
Develops and implements quality test plans and performs failure analysis. Performs FMECA, documents CRAMS, and works with the supplier and engineer to enable the CRAMS test plan. Also performs PQP at the supplier, along with reliability modeling and ERAMS. 4. Gathers operational and test data and evaluates results. Prepares documentation for testing. 5. Develops methods and parameters, project methodology and/ or project proposals. 6. Evaluates work methods, procedures and policies to ensure world class quality standards are attainable. 7. May be accountable for projects/ programs as well as developing methods and parameters, project methodology and/ or project proposals. 8. Coaches, mentors, and conducts training for targeted organizations on quality & reliability process Functional Knowledge · Demonstrates expanded conceptual knowledge in own discipline and broadens capabilities Business Expertise · Understands key business drivers; uses this understanding to accomplish own work Leadership · No supervisory responsibilities but provides informal guidance to new team members Problem Solving · Solves problems in straightforward situations; analyzes possible solutions using technical experience and judgment and precedents Impact · Impacts quality of own work and the work of others on the team; works within guidelines and policies Interpersonal Skills · Explains complex information to others in straightforward situations Education: Bachelor's Degree Experience: 2 - 4 Years **Additional Information** **Time Type:** Full time **Employee Type:** Assignee / Regular **Travel:** Yes, 10% of the Time **Relocation Eligible:** Yes The salary offered to a selected candidate will be based on multiple factors including location, hire grade, job-related knowledge, skills, experience, and with consideration of internal equity of our current team members. 
In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation such as participation in a bonus and a stock award program, as applicable. For all sales roles, the posted salary range is the Target Total Cash (TTC) range for the role, which is the sum of base salary and target bonus amount at 100% goal achievement. Applied Materials is an Equal Opportunity Employer. Qualified applicants will receive consideration for employment without regard to race, color, national origin, citizenship, ancestry, religion, creed, sex, sexual orientation, gender identity, age, disability, veteran or military status, or any other basis prohibited by law. In addition, Applied endeavors to make our careers site (**************************************************** accessible to all users. If you would like to contact us regarding accessibility of our website or need assistance completing the application process, please contact us via e-mail at Accommodations_****************, or by calling our HR Direct Help Line at ************, option 1, and following the prompts to speak to an HR Advisor. This contact is for accommodation requests only and cannot be used to inquire about the status of applications.
    $110.5k-152k yearly 1d ago
  • Site Reliability Engineer III

    Veeam 4.1company rating

    San Francisco, CA jobs

    Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world's biggest brands. The future of data resilience is here - go fearlessly forward with us. About The Role We are looking for an experienced Senior Site Reliability Engineer to join the Veeam Data Cloud (VDC) engineering team. You will be working with a global team to build the world's next modern data protection platform for Veeam. This is an excellent opportunity for someone with SaaS experience to work with a cutting‑edge technology stack based on containers, serverless infrastructure, Golang, public cloud services in the SaaS domain. What You'll Do Design, implementation and maintenance of scalable and reliable infrastructure solutions on Microsoft Azure and additional cloud platforms in the future Automation of the deployments, maintenance of a resilient, secure, and efficient SaaS application platform to meet established service levels Upkeep and support of delivery and release pipelines Continuous evaluation and improvement of the reliability, performance, and scalability of our systems Development of comprehensive monitoring and alerting solutions Incident response for distributed applications in production environments, including a mandatory participation in on-call rotations Proactively meet standards for information security and compliance, such as ISO (International Standards Organization), SOX (Sarbanes Oxley), SSAE (Standards for Attestation Engagements) 16, etc. 
Shepherd the definition, documentation, and improvement of our internal standards for style and maintainability Technologies We Work With Microsoft TFS, Azure DevOps, Git, BitBucket Azure (Entra ID, API Management, Cosmos DB, Storage services, Azure Functions, static website hosting, Azure security, etc.) IaC tools (Azure ARM templates, AWS CloudFormation, Terraform, the Serverless Framework, etc.) Observability (Azure Monitor, AppInsights, Elastic Stack) What You'll Bring 3+ years of experience in 24x7 production operations for a SaaS (Software as a Service) or cloud service provider Experience with implementation and maintenance of leading infrastructure and application monitoring tools (Azure Monitor, AppInsights, Elastic Cloud) Experience managing Azure IaaS (Infrastructure as a Service) and PaaS (Platform as a Service) solutions Strong problem‑solving skills and the ability to troubleshoot complex issues in a distributed, multi‑tenant environment Experience with container orchestration and management platforms Possess system programming skills in Python, PowerShell, Bash, Go, etc. Experience with implementation, maintenance, and support of CI/CD practices and tools (Azure DevOps or similar) Experienced with distributed, event‑based messaging architectures (Azure Event Hub, Azure Service Bus, Kafka, etc.) English proficiency level sufficient to communicate with international teams Bonus Skills Industry‑recognized certifications in the relevant field (e.g. AZ‑400, AWS Certified DevOps Engineer, DCA) Experience with migrating and adapting on‑premises products to cloud infrastructure Experience with AWS (ECS, RDS, DynamoDB, VPCs, Step Functions, Lambda, IAM, EC2, S3, etc.) Experience with C# and .NET Remote work is only possible for employees located in the United States. 
What You'll Get Unlimited paid time off, 12 paid holidays, plus 4 extra global Veeam Days for self‑care and 24 paid volunteer hours annually through Veeam Cares Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents Medical, dental, and vision coverage starting on your first day Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program 401(k) retirement plan with company matching contributions Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time AirVet: 24/7 virtual veterinary care at no cost Legal services, identity protection, and supplemental health insurance options Tax‑advantaged spending accounts for healthcare, dependent care, and commuting Opportunities to learn and grow through on‑demand libraries (LinkedIn Learning, O'Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning Compensation Transparency Veeam is committed to pay transparency and equitable compensation. For this role, the compensation range below reflects the expected total target compensation (TTC), inclusive of base pay and a competitive performance‑based bonus. For roles with a commission plan, the compensation range represents On Target Earnings (OTE), which includes base salary plus variable commission. When determining compensation, Veeam takes into consideration factors such as experience, education, skills, and geographic zone. Offers are typically made below the midpoint of the range. In addition to compensation, Veeam provides a comprehensive benefits package, including health coverage, retirement plans, and unlimited time off. 
    Zone 1: San Francisco Bay Area, New York City Boroughs - $151,500 - $252,500 USD Zone 2: Washington, California (excluding San Francisco Bay Area) - $138,900 - $231,400 USD Zone 3: Texas, Illinois, North Carolina, Colorado, Massachusetts, Pennsylvania, Virginia, Oregon, Nevada, Hawaii, New York (excluding NYC boroughs); Sales roles located in Georgia, Ohio, and Arizona - $126,300 - $210,400 USD Zone 4: All other US locations - $109,800 - $183,000 USD Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.
    $151.5k-252.5k yearly 3d ago
  • Reliability Engineer II - RAM/FRACAS

    Raytheon 4.6company rating

    Miami, FL jobs

    Country: United States of America Onsite U.S. Citizen, U.S. Person, or Immigration Status Requirements: The ability to obtain and maintain a U.S. government issued security clearance is required. U.S. citizenship is required, as only U.S. citizens are eligible for a security clearance Security Clearance: DoD Clearance: Secret At Raytheon, the foundation of everything we do is rooted in our values and a higher calling - to help our nation and allies defend freedoms and deter aggression. We bring the strength of more than 100 years of experience and renowned engineering expertise to meet the needs of today's mission and stay ahead of tomorrow's threat. Our team solves tough, meaningful problems that create a safer, more secure world. Life Cycle Engineering (LCE) is responsible for ensuring our products are Safe, Reliable, Maintainable, and delivered on time. It consists of multiple disciplines that support engineering, our program offices, and our customers throughout the total life cycle of our products-from conception to deactivation. Our primary focus is on product support, including the following disciplines: Reliability, System Safety, and Supportability. As part of this mission, Raytheon currently has an exciting opportunity for a Reliability, Availability, Maintainability (RAM) and Failure Reporting, Analysis, and Corrective Action System (FRACAS) Engineer to help support our goal of making the world a safer place. 
    What You Will Do Conducting Reliability, Availability, and Maintainability (RAM) analyses, including reliability predictions and maintainability predictions Performing Failure Modes Effects and Criticality Analyses (FMECA) and Fault Detection/Fault Isolation (FDFI) analyses Defining/allocating RAM requirements Assessing RAM performance: Mean Time Between Failure (MTBF), Mean Time Between Critical Failure (MTBCF) & Operational Availability (Ao) Interpreting data and analyzing results Ensuring performance compliance to requirements Maintaining complex Reliability Block Diagrams (RBD) and system models Presenting to internal and external stakeholders at program meetings and Failure Review Boards (FRB) Performing FRACAS tasks, including trend analysis and failure investigations This is a full-time onsite position based in Tewksbury, MA Qualifications You Must Have Typically requires a degree in Science, Technology, Engineering or Mathematics (STEM) and a minimum of 2 years of prior relevant experience Experience with reliability analysis, failure investigations, and/or root cause/corrective action (RCCA) Qualifications We Prefer Experience with Reliability and FRACAS Engineering Concepts including knowledge of reliability and maintainability predictions, FMECA/FDFI, RBDs, MTBF/MTBCF, Ao Experience presenting to Failure Review Boards Electrical Engineering experience, Electrical Engineering degree/education preferred Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy. Ability to present technical material to varied audiences Ability to work collaboratively across disciplines What We Offer Our values drive our actions, behaviors, and performance with a vision for a safer, more connected world. At RTX we value: Safety, Trust, Respect, Accountability, Collaboration, and Innovation. Relocation assistance is available! Learn More & Apply Now! 
    Please consider the following role type definition as you apply for this role: Onsite: Employees who are working in Onsite roles will work primarily onsite. This includes all production and maintenance employees, as they are essential to the development of our products. Clearance Information: This position requires a security clearance. DCSA Consolidated Adjudication Services (DCSA CAS), an agency of the Department of Defense, handles and adjudicates the security clearance process. More information about Security Clearances can be found on the US Department of State government website here: ************************************************ As part of our commitment to maintaining a secure hiring process, candidates may be asked to attend select steps of the interview process in-person at one of our office locations, regardless of whether the role is designated as on-site, hybrid or remote. The salary range for this role is 68,900 USD - 131,100 USD. The salary range provided is a good faith estimate representative of all experience levels. RTX considers several factors when extending an offer, including but not limited to, the role, function and associated responsibilities, a candidate's work experience, location, education/training, and key skills. Hired applicants may be eligible for benefits, including but not limited to, medical, dental, vision, life insurance, short-term disability, long-term disability, 401(k) match, flexible spending accounts, flexible work schedules, employee assistance program, Employee Scholar Program, parental leave, paid time off, and holidays. Specific benefits are dependent upon the specific business unit as well as whether or not the position is covered by a collective-bargaining agreement. Hired applicants may be eligible for annual short-term and/or long-term incentive compensation programs depending on the level of the position and whether or not it is covered by a collective-bargaining agreement. 
    Payments under these annual programs are not guaranteed and are dependent upon a variety of factors including, but not limited to, individual performance, business unit performance, and/or the company's performance. This role is a U.S.-based role. If the successful candidate resides in a U.S. territory, the appropriate pay structure and benefits will apply. RTX anticipates the application window closing approximately 40 days from the date the notice was posted. However, factors such as candidate flow and business necessity may require RTX to shorten or extend the application window. RTX is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or veteran status, or any other applicable state or federal protected class. RTX provides affirmative action in employment for qualified Individuals with a Disability and Protected Veterans in compliance with Section 503 of the Rehabilitation Act and the Vietnam Era Veterans' Readjustment Assistance Act.
    $65k-90k yearly est. 20h ago
  • Senior PostgreSQL DBRE - Scale, Reliability & Automation

    Okta, Inc. 4.3company rating

    San Francisco, CA jobs

    A leading identity management firm is looking for a Senior Database Reliability Engineer (DBRE) in San Francisco, California. The ideal candidate will have over 4 years of experience specifically with PostgreSQL and will be responsible for designing and optimizing data persistence layers for mission-critical systems. Key responsibilities include leading database incidents, working cross-functionally with platform teams, developing automation for tasks, and ensuring high availability across database environments. This position is essential for operational excellence in a hybrid environment.
    $157k-199k yearly est. 4d ago
  • Staff Site Reliability Engineer

    Veeam 4.1company rating

    San Francisco, CA jobs

    Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world's biggest brands. The future of data resilience is here - go fearlessly forward with us. About the Role Veeam is launching a global Site Reliability Engineering (SRE) function to support the rollout and operation of our new SaaS offering: the Veeam Data Cloud. As a Staff Site Reliability Engineer, you will serve as a hands‑on technical leader within the SRE team, guiding senior engineers, influencing product development teams, and ensuring the systems we operate are built to be reliable, scalable, and observable from the ground up. You will drive strategic initiatives, mentor others in the practice of SRE, and help define architectural best practices across our platform. This role is pivotal in aligning teams, enforcing high standards, and scaling SRE principles globally within Veeam. Reliability Engineering & Resilience Act as a technical authority in your area, mentoring senior engineers and guiding design choices that improve service reliability and resilience. Lead the definition and enforcement of SLIs, SLOs, and error budgets; drive adherence across engineering teams. Collaborate with Staff peers across teams to align strategy and champion shared reliability standards and goals. Partner with development and product teams to proactively design for failure, build resilient architecture, and operationalize reliability from the start. Observability & Operational Excellence Drive company‑wide adoption of observability best practices and tooling. 
Ensure metrics, logs, and traces provide deep, actionable insights across systems. Lead complex incident responses, post‑mortems, and systemic reliability improvements. Promote and enforce a blameless culture of learning and continuous improvement. Engineering at Scale Lead initiatives in infrastructure as code, deployment automation, and resilience testing. Influence the development and adoption of chaos engineering practices and release validation frameworks. Partner with platform and security teams to ensure production readiness. Collaboration & Culture Work closely with your peer Staff Engineers to plan, align, and deliver against reliability goals. Provide architectural guidance and advocate for engineering rigor and consistency. Represent the SRE team in technical leadership forums and product planning discussions. What We Are Looking For Required 8+ years of experience in a Software Engineering or SRE role, including technical leadership. Demonstrated experience mentoring and guiding senior engineers. Deep expertise in building distributed systems on public cloud (Azure preferred). Strong skills in programming (e.g., JS, Go, Typescript, Java, or C#). Hands‑on experience with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry). Mastery of infrastructure automation tools (Terraform, Pulumi) and container orchestration (Kubernetes). Ability to communicate clearly across geographies and disciplines. Preferred Experience leading SRE initiatives across multiple product teams. Background in chaos engineering, incident learning, or performance and load testing. Familiarity with global compliance standards (ISO, SOC 2, GDPR, FedRAMP, CMMC). Why Join Veeam? Be a core architect in the rollout of Veeam's first global SaaS offering - the Veeam Data Cloud. Help shape a modern, engineering‑driven SRE practice from the ground up. Influence long‑term reliability and architecture across a global product portfolio. 
Work in a collaborative environment with engineering leaders who value strategic thinking, hands‑on problem solving, and customer empathy. Enjoy competitive pay and benefits, flexible work arrangements, and a team culture built on learning, ownership, and impact. Benefits Unlimited PTO Paid Holidays Veeam Care Days: 24 hours paid time for volunteering Medical, dental, and vision coverage starting on day one (multiple plan options) Flexible Spending Accounts (FSA) and Health Savings Account (HSA) options Employer HSA contributions (for HDHP participants) Life and AD&D insurance (employee, spouse/partner, and child options) Company‑paid short‑term and long‑term disability insurance Supplemental individual disability insurance (IDI) Family planning support: fertility, adoption, surrogacy, and parental resources Paid parental leave Employee Assistance Program Additional voluntary benefits: accident, critical illness, hospital indemnity, legal, identity theft protection, commuter benefits, pet care Mental health support 401(k) plan Professional training and education, on‑demand learning libraries (LinkedIn Learning, O'Reilly), mentoring, workshops, and Global Day of Learning Compensation Transparency Veeam is committed to pay transparency and equitable compensation. For this role, the compensation range below reflects the expected total target compensation (TTC), inclusive of base pay and a competitive performance‑based bonus. For roles with a commission plan, the compensation range represents On Target Earnings (OTE), which includes base salary plus variable commission. Offers are typically made below the midpoint of the range. 
    Zone 1: San Francisco Bay Area, New York City Boroughs - $293,100 - $544,200 USD Zone 2: Washington, California (excluding San Francisco Bay Area) - $268,600 - $498,900 USD Zone 3: Texas, Illinois, North Carolina, Colorado, Massachusetts, Pennsylvania, Virginia, Oregon, Nevada, Hawaii, New York (excluding NYC boroughs); Sales roles located in Georgia, Ohio, and Arizona - $244,200 - $453,500 USD Zone 4: All other US locations - $212,500 - $394,600 USD Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential. Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice. The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from consideration for employment or, if discovered after employment begins, termination of employment.
    $118k-159k yearly est. 4d ago
  • Site Reliability Engineer US - San Francisco

    Near Inc. 4.6company rating

    San Francisco, CA jobs

    The NEAR AI engineering team is developing decentralized and confidential machine learning infrastructure to power user-owned AI. We currently focus on building infrastructure to enable private and confidential inference that works across different compute providers, as well as a blockchain-based coordination layer that incentivizes compute providers to join the decentralized inference network. You will own various components and drive critical decisions throughout their life cycles, including architecture, implementation, and maintenance. You will collaborate with highly knowledgeable and skilled colleagues who are passionate about solving hard problems that can disrupt the industry. What You'll Be Doing: End-to-end infrastructure ownership (for handling telemetry data, for performing testing, etc.) Design and implementation of infrastructure components that manage clusters of GPUs with special configurations Performance tuning and optimizations Create and maintain runbooks that support the on-call rotation Participate in the on-call rotation. 
    Support code releases and delivery Plan and implement infrastructure cost and security strategies Plan and implement effective CI/CD Pipelines to facilitate development processes What We're Looking For: Agility to quickly learn new programming languages and technologies Ability to write clean and efficient code Ability to transform ambiguous problems into tangible solutions or prototypes Experience with software concurrency or parallelism Experience in building, operating, and scaling Cloud infrastructure (GCP, AWS, etc) Experience with data visualization and observability tooling (Grafana, Graphite, Zabbix, etc) Detail-oriented mindset with a focus on setting priorities and progressing towards objectives Excellent communication and teamwork skills Bachelor's Degree in Computer Science or a related field We'd Love If You Have: Experience with NEAR or other blockchain internals Experience with GPUs Experience with Trusted Execution Environments Experience debugging and troubleshooting complex concurrent systems Professional experience with Rust Locations: onsite, San Francisco office
    $126k-176k yearly est. 2d ago
  • Site Reliability Engineer - AI Inference Infra & GPU Clusters

    Near Inc. 4.6company rating

    San Francisco, CA jobs

    A tech company specializing in AI infrastructure based in San Francisco is looking for a candidate to own the development of decentralized machine learning infrastructure. The role involves designing components, performance tuning, and collaboration with skilled colleagues. The ideal candidate should have experience in Cloud infrastructure and software concurrency, along with a Bachelor's degree in Computer Science. Excellent communication skills and the ability to learn quickly are essential. The position is onsite at the San Francisco office.
    $126k-176k yearly est. 2d ago
  • Business Value Engineer

    Ironclad 3.8company rating

    San Francisco, CA jobs

    Ironclad is the leading AI contracting platform that transforms agreements into assets. Contracts move faster, insights surface instantly, and agents push work forward, all with you in control. Whether you're buying or selling, Ironclad unifies the entire process on one intelligent platform, providing leaders with the visibility they need to stay one step ahead. That's why the world's most transformative organizations, from OpenAI to the World Health Organization and the Associated Press, trust Ironclad to accelerate their business. We're consistently recognized as a leader in the industry: a Leader in the Forrester Wave and Gartner Magic Quadrant for Contract Lifecycle Management, a Fortune Great Place to Work, and one of Fast Company's Most Innovative Workplaces. Ironclad has also been named to Forbes' AI 50 and Business Insider's list of Companies to Bet Your Career On. We're backed by leading investors including Accel, Y Combinator, Sequoia, BOND, and Franklin Templeton. For more information, visit ******************* or follow us on LinkedIn. The Business Value Engineer role is intended to own, define, and execute Ironclad's value-selling methodology across the customer lifecycle. This role is critical in bringing financial rigor and strategic storytelling to the sales process, ensuring our prospects and customers clearly understand the projected and realized impact of Ironclad's technology. You will be responsible for the end-to-end value lifecycle-from initial discovery and financial modeling to executive-level presentations and post‑sale value realization. This role is cross‑functional, partnering with Sales, Customer Success, Product, and Marketing to ensure our value strategy drives revenue growth, deal velocity, and long‑term customer success. What you'll do: Deal‑Level Execution: Own the value strategy for complex deals by leading discovery‑based sales processes to identify customer pain points and business objectives. 
Develop custom financial models (TCO, ROI) and defend them against CFO and procurement scrutiny. Process Analysis & Standardization: Map as‑is workflows to identify inefficiencies and quantify metrics. Standardize advanced modeling methodologies (scenario‑based, risk‑adjusted) and build repeatable process frameworks for specific industries or functions. Strategic Programs: Lead multi‑threaded value programs that extend beyond individual deals. Create prioritization frameworks for the team to focus the organization on the highest‑impact paths. Sales Partnership & Influence: Provide informal coaching and share best practices across the team to raise the collective bar. Partner seamlessly with GTM and Sales Engineers (SEs) on deal strategy and sponsorship creation. Who are you? Experience: 6+ years of experience in value engineering, management consulting, or software sales, with a track record of driving complex deal strategy. Financial Expertise: Deep experience in financial modeling, specifically developing custom business cases that detail the value of complex software solutions. Resilient & Adaptive: Proven ability to stay effective under ambiguity and adjust value narratives quickly based on customer feedback and shifting deal dynamics. Agile Problem Solver: Ability to guide teams on when to pivot approaches and teach others how to unblock value engagements with methodology when resources are limited. Strategic Storyteller: Demonstrated ability to identify patterns from CXO conversations to innovate and pilot new solutions that establish the team as a trusted advisor. Collaborative Leader: Extensive experience collaborating with cross‑functional teams, including Sales, Sales Engineering, and gTech, to drive persuasive customer adoption. Technical Rigor: Proficiency in mapping as‑is workflows to identify inefficiencies and quantify metrics for "above-the-line" executive audiences. 
    OTE Range: $188,000.00 - $235,000.00 The OTE range represents the minimum and maximum of the OTE range for this position based at our San Francisco headquarters. The OTE offered for this position will depend on numerous factors, including individual proficiency, anticipated performance, and the location of the selected candidate. Our OTE is just one component of Ironclad's competitive total rewards package, which also includes equity awards (a new hire grant, along with opportunities for additional awards throughout your tenure), competitive health and wellness benefits, and a commitment to career growth and development. US Employee Benefits at Ironclad: 100% health coverage for employees (medical, dental, and vision), and 75% coverage for dependents with buy‑up plan options available Market‑leading leave policies, including gender‑neutral parental leave and compassionate leave Family forming support through Maven for you and your partner Paid time off - take the time you need, when you need it Monthly stipends for wellbeing, hybrid work, and (if applicable) cell phone use Mental health support through Modern Health, including therapy, coaching, and digital tools Pre‑tax commuter benefits (US Employees) 401(k) plan with Fidelity with employer match (US Employees) Regular team events to connect, recharge, and have fun And most importantly: the opportunity to help build the company you want to work at UK Employee-specific benefits are included on our UK job postings Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
    $188k-235k yearly 4d ago
  • Reliability/DFX Engineer

    Openai 4.2company rating

    San Francisco, CA jobs

    About the Team OpenAI's Hardware organization develops silicon and system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. In addition to delivering production-grade silicon for OpenAI's supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI. About the Role We are seeking a highly skilled cross-stack engineer with deep expertise in making ML systems reliable at scale. This hands-on individual contributor will sit within our hardware team and work closely with chip design, platform design, hardware health, and the broader industry ecosystem to architect, implement, and deploy reliable next-generation AI accelerator systems. This engineer will evaluate system and chip architecture holistically, identify high-ROI opportunities to improve reliability and availability across the stack, and translate those opportunities into strategy and silicon features. In this role, you will Oversee DFX architecture, implementation, and execution in silicon from concept to high-volume deployment, and propose high-ROI features to enhance reliability and fault tolerance. DFX includes design for testability, reliability, availability, and serviceability of high-performance AI hardware. Build system-level reliability models grounded in empirical data to guide organization-wide DFX and reliability strategy. This requires a detailed understanding of chip and system architecture, design, implementation, and component-level reliability. 
Collaborate with chip and platform architecture/design teams to explore and implement DFX features, including the specification and implementation of digital/mixed-signal IP, firmware/system software, and DFX methodology (in partnership with engineering teams). Partner with hardware health and platform design teams to continuously improve reliability and fault tolerance in NPI and HVM. This includes optimizing operating conditions, designing experiments, and performing data analysis to drive continuous, data-driven improvements across the stack. Serve as the DFX/reliability champion and evangelist to align the broader industry ecosystem with OpenAI's requirements and roadmap. Qualifications BS with 15+ years, MS with 10+ years, or PhD with 3+ years of relevant industry experience focused on reliability across the chip/platform stack. Hands-on experience with RTL design and DFT is required; physical implementation and/or silicon ATE experience is preferred. Detailed understanding of ML chip and platform architecture and ML workload characteristics is required. Strong fundamentals in reliability modeling, with hands-on skills in empirical data analysis. About OpenAI OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. 
For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement. Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations. To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance. We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link. OpenAI Global Applicant Privacy Policy At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
    $127k-176k yearly est. 2d ago
  • Reliability Engineer: Scale Systems, Observe & Automate

    OpenAI 4.2 company rating

    San Francisco, CA jobs

    A leading AI research company based in San Francisco is seeking experienced reliability engineers to scale their infrastructure and ensure system performance and reliability. This role involves collaborating with diverse teams to develop resilient systems and enhance operations. Candidates should have strong cloud proficiency, experience in containerization technologies, and a bachelor's degree in a related field.
    $127k-176k yearly est. 5d ago
  • Principal 3D/2.5D Package Layout Engineer

    Arm Limited 4.8 company rating

    San Jose, CA jobs

    A leading technology firm has an exciting opportunity for a Principal Package Layout Engineer in California. This role involves using advanced design techniques for complex package systems in IoT and automotive sectors. The ideal candidate has a degree in Electrical Engineering, over 10 years of experience with Cadence tools, and a deep understanding of package layouts. The company promotes hybrid working and provides support for diverse needs, offering a salary range of $241,100-$326,100 per year.
    $106k-145k yearly est. 2d ago
  • Principal Package Layout Engineer

    Arm Limited 4.8 company rating

    San Jose, CA jobs

    Arm Principal Package Layout Engineer Arm has a great opportunity for a Principal Package Layout Engineer in our System-in-Package team! This Engineer will be part of a team responsible for implementing Arm's next generation of SoCs in the IoT, Automotive, and Compute spaces. These complex designs use current and emerging 2.5D and 3D technologies and will require you to work closely with IP, PCB, SI/PI and Systems teams in France, US, UK, and other Arm locations as needed, as well as our technology partners. Responsibilities: Layouts using best-known practices for DFM, DFA, Signal and Power Delivery Networks Work with minimal supervision and approach challenges with enthusiasm and persistence Bring forward ideas to improve overall team efficiency Communicate and coordinate with external vendors as necessary Required Skills and Experience: An Electrical Engineering degree or equivalent Experience with 2.5D and 3D package systems and their SI/PI requirements 10+ years of recent experience in the use of Cadence APD & Allegro toolsets Facilitate standardization by being aware of the bigger picture, including design features which will drive standardization and improve efficiency throughout the department Nice To Have Skills and Experience: Familiarity with EDA vendor collaboration, tool evaluation, and issue resolution Strong verbal, written communication and presentation skills Programming experience in Python, SKILL, or TCL Able to optimize design methodologies in packaging design In Return: We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application, interview process, perform essential job functions, and to receive other benefits and privileges of employment. Salary Range: $241,100-$326,100 per year Accommodations at Arm: At Arm, we want to build extraordinary teams. If you need an adjustment or an accommodation during the recruitment process, please email accommodations@arm.com. 
All accommodation or adjustment requests will be treated with confidentiality, and information concerning these requests will only be disclosed as necessary to provide the accommodation. Hybrid Working at Arm: Arm's approach to hybrid working is designed to create a working environment that supports both high performance and personal wellbeing. We believe in bringing people together face to face to enable us to work at pace, whilst recognizing the value of flexibility. Within that framework, we empower groups/teams to determine their own hybrid working patterns, depending on the work and the team's needs. Details of what this means for each role will be shared upon application. Equal Opportunities at Arm: Arm is an equal opportunity employer, committed to providing an environment of mutual respect where equal opportunities are available to all applicants and colleagues. We are a diverse organization of dedicated and innovative individuals, and don't discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
    $106k-145k yearly est. 2d ago
  • Senior Value Engineer - Strategic Deals & ROI Expert

    Ironclad 3.8 company rating

    San Francisco, CA jobs

    A leading AI contracting platform is seeking a Business Value Engineer in San Francisco to own and execute value-selling methodology throughout the customer lifecycle. This role will drive revenue growth and customer success by partnering with Sales and Customer Success while developing financial models to articulate the impact of the platform. Candidates should have over six years of relevant experience and a proven track record in strategic storytelling and agile problem-solving. Competitive compensation is offered, including benefits and opportunity for growth.
    $96k-132k yearly est. 4d ago

Learn more about Smiths Group jobs