Reliability Engineer jobs at Mastercard - 472 jobs

  • Site Reliability Engineer

    The Voleon Group 4.1 company rating

    Berkeley, CA jobs

    Voleon is a technology company that applies state‑of‑the‑art AI and machine learning techniques to real‑world problems in finance. For nearly two decades, we have led our industry and worked at the frontier of applying AI/ML to investment management. We have become a multibillion‑dollar asset manager, and we have ambitious goals for the future. Your colleagues will include internationally recognized experts in artificial intelligence and machine learning research as well as highly experienced finance and technology professionals. The people who shape our company come from other backgrounds, including concert music performances, humanitarian aid, opera singing, sports writing, and BMX racing. You will be part of a team that loves to succeed together. In addition to our enriching and collegial working environment, we offer highly competitive compensation and benefits packages, technology talks by our experts, a beautiful modern office, daily catered lunches, and more. As a Site Reliability Engineer (SRE), you will work at the intersection of production operations and software development as you improve, manage, and monitor production‑critical infrastructure and data pipelines. At Voleon, many SREs serve together on a Production Operations team tasked with improving shared production infrastructure. Others are embedded with teams of software engineers to improve specific production systems owned by those teams. Voleon SREs work on important real‑world problems and collaborate with passionate and talented colleagues in an empowering, results‑driven environment. This role is a way to make a real difference: your contributions will make our critical systems more reliable, lower operational risk, and increase the efficiency of our engineering effort. 
    Responsibilities
      • Improve fault‑tolerance and maintainability of code in proprietary data pipelines and trading systems
      • Diagnose and fix bugs in code
      • Lead complex deployments
      • Automate manual workflows
      • Track and prioritize outstanding production‑related issues
      • Share an on‑call rotation responding to incidents to ensure the continuous operation of production‑critical systems

    Requirements
      • Experience with coding and debugging Python
      • Experience with Linux
      • Familiarity with Relational Databases & SQL
      • Sharp analytical and problem‑solving skills and a persistent drive to make things work (better)
      • Strong growth mindset and a passion for learning
      • Strong technical communication skills
      • Attention to detail
      • 2 years of relevant industry experience
      • An undergraduate degree or comparable training in a quantitative field or equivalent, relevant industry experience

    Preferred Qualifications
      • Familiarity with best practices concerning code maintainability, documentation, quality assurance, continuous integration and deployment
      • Experience supporting production systems
      • Experience with any of the following: gRPC microservices, Postgres, Pandas, Golang, R, Git, Jenkins, Bazel, Prometheus, Grafana, Airflow, Kubernetes

    The base salary for this position is $120,000 to $160,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation such as bonus compensation and other benefits. Our benefits package includes medical, dental and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match.
Friends of Voleon Candidate Referral Program If you have a great candidate in mind for this role and would like to have the potential to earn $7,500 - $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this to submit your referral. For more details regarding eligibility, terms and conditions please make sure to review the Voleon Referral Bonus Program. Equal Opportunity Employer The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
    $120k-160k yearly 3d ago
  • Senior AI Site Reliability Engineer

    Charles Schwab Corporation 4.8 company rating

    San Francisco, CA jobs

    Your Opportunity At Schwab, you will build a rewarding career while making a difference in the lives of our millions of clients. Here, innovative thinking meets creative problem solving as we work together to challenge the status quo. We believe in the power of collaboration and value being together in the office, which is why this role is based on-site in our San Francisco office. Joining Schwab means joining a company committed to transforming the financial industry and putting clients at the center of everything we do. Schwab's AI Strategy & Transformation team, known as AI.x, is the central hub for Artificial Intelligence at Schwab. We are an integrated product, engineering, strategy and risk team, all based in San Francisco. We help set the enterprise vision for AI, invest in the most promising opportunities, and accelerate delivery across the company. We also build the core platform that powers AI at scale and explore next‑generation GenAI efforts that will redefine how we serve our clients. As a Senior AI Site Reliability Engineer on AI.x, you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient-enabling us to deliver innovative experiences to millions of clients. This role is more than a reliability engineering position. It is an opportunity to join a high‑profile team shaping Schwab's future with AI, to build and maintain solutions that matter to millions of clients, and to grow your career in one of the most exciting areas of technology today. As a Senior AI Site Reliability Engineer, you will design, implement, and manage the reliability and operational excellence of GenAI applications and platforms. You will work closely with architects, engineers, and business leaders to align reliability practices with Schwab's enterprise strategy. You will mentor and coach junior engineers, helping to build strong operational practices and foster a culture of continuous improvement. 
You will lead by example in solving complex reliability challenges, advancing SRE standards, and driving rapid iteration from concept to production. Above all, you will bring curiosity, creativity, and technical depth to help shape the next generation of reliable AI at Schwab. What you have Required Qualifications 8+ years of software development or reliability engineering experience, with 4+ years as a hands‑on senior engineer in startups and/or large organizations. Bachelor's degree in Computer Science or related field. 5+ years of experience building and operating complex products from scratch and running them in production. 3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact. 3+ years of experience building and maintaining data pipelines and infrastructure for large datasets. 3+ years of experience with containers and cloud‑native applications, and the ability to operationalize them in the public cloud with infrastructure as code. Experience implementing monitoring, alerting, and incident response for large‑scale distributed systems. Proven track record in driving reliability, scalability, and performance improvements for production AI systems. Preferred Qualifications Strong computer science fundamentals and experience working across different parts of the tech stack. Experience working with proprietary or open‑source LLMs (Gemini, Claude, OpenAI or other models) and supporting LLM‑powered applications in production. Focus on quality and reliability in everything you do. Continue to raise the bar and drive others to deliver high‑quality, resilient products, with experience writing tests and implementing automated reliability checks. Experience writing and running evaluations to ensure quality and monitor consistency in LLM‑generated responses and actions. Strong communication skills - you balance written and verbal communication to clearly share your perspective with others on the team. 
Experience mentoring junior engineers and helping them grow their technical and operational skills through clear feedback and code reviews. Demonstrated mindset of continuous learning and improvement. Ability to solve complex problems with ambiguous or incomplete data in highly distributed systems. Demonstrated business domain knowledge related to all products you have worked on. Curiosity about new technologies and processes - you always seek to improve yourself and everyone around you and proactively seek and share knowledge with others on your team. Experience with Python and front‑end development preferred but not required. Master's or advanced degrees in Computer Science or related fields. In addition to the salary range, this role is also eligible for bonus or incentive opportunities.
    $118k-152k yearly est. 4d ago
  • Senior AI SRE: Scale GenAI Reliability & Impact

    Charles Schwab Corporation 4.8 company rating

    San Francisco, CA jobs

    A leading financial services firm is seeking a Senior AI Site Reliability Engineer responsible for designing and managing the reliability of AI-driven applications. In this role, you'll work on innovative projects and mentor junior engineers while collaborating with cross-functional teams. Candidates should have extensive experience in software development and reliability engineering, with a particular focus on AI systems. This on-site position is located in San Francisco and offers opportunities for professional growth and development.
    $118k-152k yearly est. 4d ago
  • Process Improvement Specialist/Concrete Industry

    DZ Corporation 4.3 company rating

    The Villages, FL jobs

    Reports To: Operations Manager The Process Improvement Specialist is responsible for optimizing production processes within the precast concrete facility. This role focuses on identifying inefficiencies, implementing process enhancements, and supporting quality and safety improvements across manufacturing operations. Working closely with production teams, engineers, and supervisors, the specialist helps streamline workflows, reduce waste, and ensure consistent product quality. Key Responsibilities: Process Analysis & Optimization: Observe and analyze daily production activities (casting, curing, reinforcement, finishing, etc.) to identify bottlenecks and improvement opportunities. Data Collection & Reporting: Gather and track production data such as cycle times, material usage, downtime, and defect rates to support improvement projects. Continuous Improvement Projects: Assist in implementing Lean, 5S, or Six Sigma initiatives to improve plant efficiency, reduce waste, and enhance workplace organization. Standard Work & Documentation: Help develop and update standard operating procedures (SOPs), work instructions, and visual management tools. Quality & Safety Support: Collaborate with Quality Control and Safety teams to ensure process changes meet safety standards and product specifications. Technical Support: Support the introduction of new molds, equipment, or materials by conducting process trials and documenting results. Collaboration: Partner with maintenance, engineering, and production supervisors to troubleshoot recurring process issues. Qualifications: Education: Associate's degree or technical diploma in Manufacturing Technology, Industrial Engineering, or related field. Equivalent experience in precast concrete production or process improvement will be considered. Experience: 2+ years in a manufacturing or precast concrete environment. Familiarity with Lean Manufacturing, 6S, or Continuous Improvement principles. 
Skills: Strong mechanical aptitude and understanding of production equipment. Ability to collect and interpret process data (cycle times, scrap, yield, etc.). Proficiency in Microsoft Office and basic data entry tools. Good communication and problem-solving skills. Team-oriented and hands-on approach. Preferred Qualifications: Experience with precast or concrete manufacturing processes (casting, curing, form setup, reinforcement, finishing). Knowledge of quality systems such as NPCA or PCI standards. Basic CAD or technical drawing reading ability. Certification in Lean or Six Sigma or willingness to acquire. Performance Indicators: Reduction in process waste or rework rates. Increased production throughput and efficiency. Improved safety compliance and incident reduction. Consistency in meeting product quality standards. Implementation and sustainability of improvement projects.
    $68k-100k yearly est. 3d ago
  • Senior Quality Engineer

    Affinipay, LLC 3.9 company rating

    Austin, TX jobs

    About the role: We are seeking an experienced Senior Quality Engineer to join our Quality Engineering team at 8am. In this role, you will serve as a customer advocate and quality champion for your embedded scrum team, ensuring the functionality, reliability, performance, and user experience of our FinTech/Back-End SaaS platform through comprehensive testing strategies and collaborative engineering practices. At 8am, Quality is not a phase-it's a mindset woven into every conversation, decision, and line of code. As a senior member of the team, you'll leverage your deep testing expertise to drive quality improvements that impact not just your immediate team but the broader engineering organization. You'll mentor junior engineers, contribute to process improvements, and take ownership of quality initiatives that protect and enhance our customers' experience. About us: At 8am, our vision is to power a world where professionals thrive. We start every day on a mission to empower professionals with the most trusted, innovative technology to deliver world-class outcomes for their clients and exceptional financial results for their business. They count on our purpose-built solutions to simplify operations, ensure compliance, and fuel profitable growth, so they can focus on their clients and do more of the work that matters. Founded in 2005, 8am (formerly AffiniPay) is the professional business platform built to help legal, accounting, and other client-focused professionals run stronger, more profitable businesses. Today, more than 250,000 professionals across the U.S. trust 8am to help them work smarter, serve clients better, and unlock their full potential. We have been recognized as one of Inc 5000's fastest growing companies in the U.S. for 13 years in a row, and as a result, our teams continue to grow as well! 
What you'll do: Customer Advocacy & Technical Excellence Champion the customer experience at every stage of development, viewing deliverables through the customer's lens Design and execute comprehensive test strategies across UI, API, and database layers, ensuring they reflect real customer scenarios Speak up when features aren't complete enough to test or when quality concerns arise-even if it means slowing down Challenge requirements that don't serve customer needs and identify issues that could frustrate users before they reach production Write high-quality test cases that tell a story about customer interactions and maintain them through their entire lifecycle Identify and document defects with comprehensive root cause analysis, always including "how this affects the customer" assessments Drive continuous improvement in our Defect Leakage Rate (≤15% target), recognizing that every bug reaching production erodes customer trust Leadership & Execution Complete 90%+ of assigned tasks independently while providing guidance and mentorship to junior engineers Own quality initiatives and drive measurable improvements to QE processes and standards Create and lead test plan reviews using test management tools like TestRail or Qase, ensuring adherence to documented standards Engage with Product Owners and Product Management on feature feasibility, testability, and requirement clarification Become a Subject Matter Expert (SME) for specific technical areas or product components Track and report on Quality KPIs, with particular focus on Defect Leakage Rate-our primary customer-focused metric that measures bugs caught by QE versus those found by customers Collaborate with SDET team/Developers on determining what tests are high value for automation, and ensuring that test automation operates as expected, owning automated test execution. 
Technical Proficiency Demonstrate proficient to advanced networking and Linux skills, utilizing command-line tools across common distributions Leverage Python scripting for assisting with test automation tasks as necessary (Dedicated SDET team owns framework and most automation tasks). Willingness to learn basic Python scripting if not previously leveraged Validate services deployed on AWS and other cloud platforms, managing test environments and cloud resources efficiently Basic understanding of security and performance testing Collaboration & Communication Work effectively within scrum team, and across them when user features span teams, contributing meaningfully to technical discussions Translate technical issues into customer experience impacts for both technical and non-technical stakeholders Provide timely, helpful feedback to peers and junior engineers through code reviews and test case audits Actively contribute to team knowledge sharing through documentation and presentations Participate in all Agile ceremonies with a focus on quality advocacy About you: 3-7 years of experience in software quality engineering, preferably in SaaS environments Strong demonstrated proficiency in Linux distributions and CLI-based testing, including log file analysis, and other troubleshooting tasks Experience with AWS or other major cloud platforms Basic Python/Shell (or similar) scripting knowledge with ability to edit existing scripts and create new automation Advanced skills with UI, API, and SQL testing methodologies Familiarity with test management tools such as TestRail or Qase Demonstrated experience leveraging Version Control Systems with a focus on Github and Bitbucket Experience with testing tools: Jira, Sentry, DataDog Understanding of containerization, Kubernetes, and CI/CD pipelines (Jenkins, CircleCI) Strong understanding of Agile/Scrum methodologies Proven track record of mentoring junior engineers and contributing to process improvements Excellent analytical 
and problem-solving abilities Strong communication skills with ability to present to both technical and non-technical stakeholders Most importantly: The courage to be vocal about quality concerns and testing impediments Demonstrated experience leveraging AI tools and technologies to improve workflows, enhance decision-making, or drive innovation. Additional Information The annual salary range for this position is $100,000 to $160,000. The salary range for performing this role outside of the US / Austin / California may differ. 8am is committed to offering competitive, fair and commensurate compensation and has provided an estimated pay range for this role. Actual compensation may vary based on job-related knowledge, skills, experience and education. Why 8am: At 8am, our culture is shaped by the people who bring it to life every day. Together, we build a company rooted in continuous learning, genuine community, holistic wellness, and meaningful engagement-values that empower us as individuals and unite us as a team. Our culture is grounded in our core values: Work Smart, Win Fast; Outshine Ordinary, and We Find a Way. These values drive how we serve our customers and work with each other in a collaborative, inspiring, and empowering environment, every day. Here's how we support our 8Team: Health Insurance Coverage: We offer our 8Team a variety of medical, dental, and vision plans, designed to fit your needs, including a 100% company-paid HDHP plan for employees. Financial perks: We offer a competitive compensation and benefits package including annual bonuses, equity options and 401(k) or RRSP if in Canada, with a company match for all team members. Time for what matters: Flexible Time Off, paid holidays, and a parental leave program for our new parents. Wellness: Wellness stipends, mental health support, and one-on-one nutrition coaching. 
Learning and Development: Continuous learning through 8am.edu, leadership programs, professional development funds, and individually focused talent development. Giving back to the communities around us: Participate in our charitable matching gift program, paid time off for volunteer service, and company-sponsored volunteer events (both local and virtually). Engagement: Virtual and in-person team-building events, quarterly award recognition through our Rise & Shine Award of Excellence Program, and our peer-to-peer appreciation platform. At 8am, we don't just offer benefits - we create an environment where people can thrive, grow, and make a real impact every day. Diversity, equity & inclusion at 8am: At 8am, we recognize that innovation occurs with a strong team of people who are diverse in background, personality, talent and ideas. Experience comes in many forms and ensuring a diverse and inclusive workplace where we continue to learn from each other is an integral part of our culture. We are committed to creating a welcoming and transparent environment for all that embraces those differences through education, equal access to opportunities and information, inclusionary programs, and community outreach. Security advisory: Our hiring teams at 8am are dedicated to recruiting top talent that share our passion for serving the professional services industry through innovative financial technology. As such, our Talent Acquisition Team only follows legitimate hiring practices. We will always communicate with our candidates using emails with the 8am domain and will never ask for sensitive/personal data during the application process. All interviews take place over phone call, Zoom/Google Meet or in person. All offers are communicated verbally by our Talent Acquisition Specialists with a written offer letter as a follow up.
    $100k-160k yearly 7d ago
  • Senior Quality Engineer | Mobile/UI

    Affinipay, LLC 3.9 company rating

    Austin, TX jobs

    About the role: We are seeking an experienced Senior Quality Engineer with a strong focus on mobile testing to join our Quality Engineering team at 8am. In this role, you will serve as a customer advocate and quality champion for your embedded scrum team, ensuring the functionality, reliability, performance, and user experience of our SaaS platform across mobile and web interfaces through comprehensive testing strategies and collaborative engineering practices. At 8am, Quality is not a phase; it's a mindset woven into every conversation, decision, and line of code. As a senior member of the team, you'll leverage your deep mobile testing expertise to drive quality improvements that impact not just your immediate team but the broader engineering organization. You'll mentor junior engineers, contribute to process improvements, and take ownership of quality initiatives that protect and enhance our customers' experience - whether they're accessing our platform from an iPhone, Android device, iPad, or desktop browser. About us: At 8am, our vision is to power a world where professionals thrive. We start every day on a mission to empower professionals with the most trusted, innovative technology to deliver world-class outcomes for their clients and exceptional financial results for their business. They count on our purpose-built solutions to simplify operations, ensure compliance, and fuel profitable growth, so they can focus on their clients and do more of the work that matters. Founded in 2005, 8am (formerly AffiniPay) is the professional business platform built to help legal, accounting, and other client-focused professionals run stronger, more profitable businesses. Today, more than 250,000 professionals across the U.S. trust 8am to help them work smarter, serve clients better, and unlock their full potential. We have been recognized as one of Inc 5000's fastest growing companies in the U.S. 
for 13 years in a row, and as a result, our teams continue to grow as well! What you'll do: Customer Advocacy & Technical Excellence Champion customer experience; speak up when features aren't ready or quality concerns arise Design and execute comprehensive test strategies across iOS (native + Safari), Android (native + Chrome), and desktop browsers Validate responsive design, touch interactions, gestures, orientation changes, and varying network conditions Write high-quality test cases reflecting real-world scenarios; document defects with root cause analysis Drive continuous improvement toward ≤15% Defect Leakage Rate Cross-Platform & Browser Testing Ensure seamless experiences across platform transitions (mobile ↔ desktop) Validate across macOS, Windows, and major browsers (Chrome, Safari, Firefox, Edge) Maintain a comprehensive device/browser/OS testing matrix based on user analytics Leadership & Execution Complete 90%+ of tasks independently; mentor junior engineers on mobile testing best practices Own quality initiatives and drive measurable improvements to QE standards Become SME for mobile testing, device fragmentation, or specific product components Collaborate with SDET team on automation priorities (Appium, XCUITest, Espresso) Track and report Quality KPIs with focus on Defect Leakage Rate About you: 3-7 years of experience in software quality engineering, preferably in SaaS environments Strong demonstrated experience in mobile application testing across iOS and Android platforms, including both native apps and mobile web browsers Experience testing on physical iOS devices (iPhones, iPads) and Android devices across multiple manufacturers Familiarity with the major app store submission processes and general guidelines Proficiency testing across major desktop environments (macOS, Windows) and browsers (Chrome, Safari, Firefox, Edge) Strong demonstrated proficiency in Linux distributions and CLI-based testing, including log file analysis and other 
troubleshooting tasks Experience with AWS or other major cloud platforms Basic Python/Shell (or similar) scripting knowledge with ability to edit existing scripts and create new automation Advanced skills with UI, API, and SQL testing methodologies Familiarity with test management tools such as TestRail; experience with Qase is a plus Demonstrated experience leveraging Version Control Systems with a focus on GitHub Experience with testing tools: Jira, Sentry, DataDog Strong understanding of Agile/Scrum methodologies Proven track record of mentoring junior engineers and contributing to process improvements Excellent analytical and problem-solving abilities Strong communication skills with ability to present to both technical and non-technical stakeholders Demonstrated experience leveraging AI tools and technologies to improve workflows, enhance decision-making, or drive innovation. Nice to have: Experience with mobile test automation frameworks such as Appium, XCUITest (iOS), or Espresso (Android) Experience with BrowserStack or similar cloud-based device testing platforms for cross-platform compatibility testing Experience testing SaaS products in regulated industries (such as PCI-compliant) Basic understanding of containerization, Kubernetes, and CI/CD pipelines (Jenkins, CircleCI) Knowledge of basic non-functional testing (security, performance, accessibility) with emphasis on mobile-specific concerns Experience with microservices architectures and distributed systems Understanding of responsive design principles and mobile UX best practices Certifications such as ISTQB or CSTE Familiarity with AI-assisted testing tools and leveraging LLMs as a productivity-boosting tool Experience evaluating and implementing new QE tools and processes Additional Information The annual salary range for this position is $120,000 to $150,000. The salary range for performing this role outside of the US / Austin / California may differ. 
8am is committed to offering competitive, fair and commensurate compensation and has provided an estimated pay range for this role. Actual compensation may vary based on job-related knowledge, skills, experience and education. Why 8am: At 8am, our culture is shaped by the people who bring it to life every day. Together, we build a company rooted in continuous learning, genuine community, holistic wellness, and meaningful engagement-values that empower us as individuals and unite us as a team. Our culture is grounded in our core values: Work Smart, Win Fast; Outshine Ordinary, and We Find a Way. These values drive how we serve our customers and work with each other in a collaborative, inspiring, and empowering environment, every day. Here's how we support our 8Team: Health Insurance Coverage: We offer our 8Team a variety of medical, dental, and vision plans, designed to fit your needs, including a 100% company-paid HDHP plan for employees. Financial perks: We offer a competitive compensation and benefits package including annual bonuses, equity options and 401(k) or RRSP if in Canada, with a company match for all team members. Time for what matters: Flexible Time Off, paid holidays, and a parental leave program for our new parents. Wellness: Wellness stipends, mental health support, and one-on-one nutrition coaching. Learning and Development: Continuous learning through 8am.edu, leadership programs, professional development funds, and individually focused talent development. Giving back to the communities around us: Participate in our charitable matching gift program, paid time off for volunteer service, and company-sponsored volunteer events (both local and virtually). Engagement: Virtual and in-person team-building events, quarterly award recognition through our Rise & Shine Award of Excellence Program, and our peer-to-peer appreciation platform. 
At 8am, we don't just offer benefits - we create an environment where people can thrive, grow, and make a real impact every day. Diversity, equity & inclusion at 8am: At 8am, we recognize that innovation occurs with a strong team of people who are diverse in background, personality, talent and ideas. Experience comes in many forms and ensuring a diverse and inclusive workplace where we continue to learn from each other is an integral part of our culture. We are committed to creating a welcoming and transparent environment for all that embraces those differences through education, equal access to opportunities and information, inclusionary programs, and community outreach. Security advisory: Our hiring teams at 8am are dedicated to recruiting top talent that share our passion for serving the professional services industry through innovative financial technology. As such, our Talent Acquisition Team only follows legitimate hiring practices. We will always communicate with our candidates using emails with the 8am domain and will never ask for sensitive/personal data during the application process. All interviews take place over phone call, Zoom/Google Meet or in person. All offers are communicated verbally by our Talent Acquisition Specialists with a written offer letter as a follow up.
    $120k-150k yearly 7d ago
  • Senior ML Engineer: Production Pipelines & HPC Expert

    Capital One 4.7company rating

    McLean, VA jobs

    A leading financial services company in Virginia seeks an experienced professional to design and build data-intensive solutions. The role requires expertise in C, C++, Python, Scala, and machine learning, along with the ability to lead teams and communicate complex concepts effectively. Candidates should possess a Bachelor's and preferably a Master's degree, with a proven track record in production-ready data pipelines and ML lifecycle. Competitive compensation and comprehensive benefits are offered.
    $90k-111k yearly est. 5d ago
  • Site Reliability Engineer

    The Voleon Group 4.1company rating

    Remote

    Voleon is a technology company that applies state-of-the-art AI and machine learning techniques to real-world problems in finance. For nearly two decades, we have led our industry and worked at the frontier of applying AI/ML to investment management. We have become a multibillion-dollar asset manager, and we have ambitious goals for the future. Your colleagues will include internationally recognized experts in artificial intelligence and machine learning research as well as highly experienced finance and technology professionals. The people who shape our company come from other backgrounds, including concert music performances, humanitarian aid, opera singing, sports writing, and BMX racing. You will be part of a team that loves to succeed together. In addition to our enriching and collegial working environment, we offer highly competitive compensation and benefits packages, technology talks by our experts, a beautiful modern office, daily catered lunches, and more. As a Site Reliability Engineer (SRE), you will work at the intersection of production operations and software development as you improve, manage, and monitor production-critical infrastructure and data pipelines. At Voleon, many SREs serve together on a Production Operations team tasked with improving shared production infrastructure. Others are embedded with teams of software engineers to improve specific production systems owned by those teams. Voleon SREs work on important real-world problems and collaborate with passionate and talented colleagues in an empowering, results-driven environment. 
This role is a way to make a real difference: your contributions will make our critical systems more reliable, lower operational risk, and increase the efficiency of our engineering effort. Responsibilities: Improve fault-tolerance and maintainability of code in proprietary data pipelines and trading systems; Diagnose and fix bugs in code; Lead complex deployments; Automate manual workflows; Track and prioritize outstanding production-related issues; Share an on-call rotation responding to incidents to ensure the continuous operation of production-critical systems. Requirements: Experience with coding and debugging Python; Experience with Linux; Familiarity with Relational Databases & SQL; Sharp analytical and problem-solving skills and a persistent drive to make things work (better); Strong growth mindset and a passion for learning; Strong technical communication skills; Attention to detail; 2 years of relevant industry experience; An undergraduate degree or comparable training in a quantitative field or equivalent, relevant industry experience. Preferred Qualifications: Familiarity with best practices concerning code maintainability, documentation, quality assurance, continuous integration and deployment; Experience supporting production systems; Experience with any of the following: gRPC microservices, Postgres, Pandas, Golang, R, Git, Jenkins, Bazel, Prometheus, Grafana, Airflow, Kubernetes. The base salary for this position is $120,000 to $160,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation such as bonus compensation and other benefits. Our benefits package includes medical, dental and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match. 
“Friends of Voleon” Candidate Referral Program: If you have a great candidate in mind for this role and would like to have the potential to earn $7,500 - $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this form to submit your referral. For more details regarding eligibility, terms and conditions, please make sure to review the Voleon Referral Bonus Program. Equal Opportunity Employer: The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
    $120k-160k yearly Auto-Apply 50d ago
  • Principal Site Reliability Engineer - Foundational Services

    Jpmorgan Chase 4.8company rating

    Palo Alto, CA jobs

    Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a Principal Site Reliability Engineer at JP Morgan Chase within the (insert LOB or sub LOB), you draw upon your advanced knowledge to identify new opportunities to influence critical incident management and improve the end-to-end lifecycle of software development for the firm. You will have the opportunity to manage, design, and implement infrastructure components to improve reliability and ensure operational efficiency. **Job responsibilities** + Identifies and solves problems of high complexity + Works with development teams throughout the Software Development Life Cycle to ensure sustainable software releases + Leads medium to large projects by bringing together the proper perspective, identifying roadblocks, and integrating feedback from team members and subject matter experts at the firm + Participates in support responsibilities for coverage of critical applications + Sees problems as opportunities to improve **Required qualifications, capabilities, and skills** + Formal training or certification on site reliability engineering concepts and 10+ years applied experience + Advanced knowledge of software applications and technical processes with considerable depth in one or more technical disciplines + Ability to determine how each system relates to each other and use breadth of tools to build automation to improve reliability for the firm + Experience with translating research, analysis, and tests into business recommendations + Ability to balance and be accountable for the work of multiple architects and designers **Preferred qualifications, capabilities, and skills** + Understands and leads partnerships across job functions (e.g., Cybersecurity and Data) to develop efficient and developer-friendly systems + Engages team members and expresses complex 
ideas with appropriate level of detail, while also providing constructive feedback JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management. We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process. We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. 
Visit our FAQs for more information about requesting an accommodation. JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans **Base Pay/Salary** Palo Alto,CA $204,250.00 - $285,000.00 / year
    $204.3k-285k yearly 4d ago
  • Staff Site Reliability Engineer

    Figure 4.5company rating

    San Jose, CA jobs

    Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA. We are looking for a Site Reliability Engineer to own our internal systems infrastructure. This role is responsible for setting up and managing cloud and on-prem infrastructure to deliver highly available, reliable, and automated systems. Responsibilities: Be the go to person for mission critical infrastructure enabling critical operations such as Source Configuration Management, CI/CD systems, software distribution, supplier portals, manufacturing and more. Migrate SaaS to self-hosted solutions to enhance security and reliability. Implement monitoring and alerting systems, and define incident response plans and runbooks. Reduce human workload through automation to automate deployment and scaling. Establish strong relationships with stakeholders to identify infrastructure needs and establish Service Level Objectives. Use a data driven approach to demonstrate service robustness and track optimization work. Partner with the security team to ensure that security remediations and updates are applied in a timely manner. Requirements: Strong experience with Linux/Unix systems administration Proficiency in programming/scripting Extensive experience with cloud platforms (Azure, AWS, GCP) and on-prem hardware architectures Experience designing, deploying, and operating high-availability, fault-tolerant, and distributed systems. 
Mastery of infrastructure as code (Terraform, CloudFormation, Ansible…) Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog…) Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancers, firewalls) Experience defining Service Level Objectives (SLO), developing runbooks/incident response plans, facilitating post-mortems and managing systems assets. Ability to work in cross-functional teams with developers, infra, and product teams Excellent verbal and written communication skills The US base salary range for this full-time position is between $175,000 - $250,000 annually. The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
    $175k-250k yearly Auto-Apply 44d ago
  • Site Reliability Engineer - Capital Markets

    Jefferies Financial Group Inc. 4.8company rating

    Jersey City, NJ jobs

    Jefferies is seeking a Site Reliability Engineer to play an instrumental role in supporting Equity Front office trading application, risk and middle office real time products, developed and used for Equity Cash and ETS application. As part of the wider platform engineering team, you will be working closely with the Business users interactively throughout the day, along with technical, analysis and testing colleagues. Investigation and resolution of the work items at hand will require competent technical skills and a keen intellect. The business is a growth area, with current investments taking place in all the technology, business and middle office areas. Responsibilities: Front Line Site Reliability Engineering and Support functions for Equity trading systems used by Jefferies clients as well as internal users. Build monitoring tools for application and infrastructure components. Implement and manage scalable infrastructure using cloud-native technologies and tools. Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. Partner with business, development and infrastructure teams to improve services through rigorous testing and release procedures. Develop and maintain CI/CD pipelines to streamline deployment processes. Expedient deployment of new systems. Capacity planning, Platform Management, and support for increasing volumes and business growth. Create sustainable systems and services through automation. Collaborate with Application team to establish and enforce production and development standards. Document procedures, best practices and troubleshooting FAQs. Resolve complex application and technical problems. Debugging the system and fixing the production related issues. Escalate / follow-up on permanent fix for development related issues. Lead incident response efforts and post-mortem analysis to prevent future occurrences. 
Handles complex operational tasks and recommends process and technology changes. Global support and includes weekend availability to troubleshoot production related issues and perform checkouts. Ability to work both independently and in groups in an energetic, diverse environment. Participate in on-call rotations to ensure 24/7 system availability and support. Support compliance and legal queries. Qualifications: Strong experience in Windows and Linux/Unix services. Strong experience in scripting languages such as PowerShell, Python and SQL. Strong Knowledge of monitoring tools - Nagios, Splunk, OTEL, Datadog. Strong Knowledge of FIX protocol. Strong Domain skills - Must have working experience in Capital Markets across modules and instruments especially - CASH, ETS, Bonds, Options, Futures, Swaps products. Experience in BFSI (Banking and Financial Industry) Domain applications with a proper understanding of the Trade Lifecycle. Excellent communication, time management and project management skills. Primary Location Full Time Salary Range of $175,000 - $200,000
    $175k-200k yearly Auto-Apply 60d+ ago
  • Site Reliability Engineer

    The Voleon Group 4.1company rating

    Berkeley, CA jobs

    Voleon is a technology company that applies state-of-the-art AI and machine learning techniques to real-world problems in finance. For nearly two decades, we have led our industry and worked at the frontier of applying AI/ML to investment management. We have become a multibillion-dollar asset manager, and we have ambitious goals for the future. Your colleagues will include internationally recognized experts in artificial intelligence and machine learning research as well as highly experienced finance and technology professionals. The people who shape our company come from other backgrounds, including concert music performances, humanitarian aid, opera singing, sports writing, and BMX racing. You will be part of a team that loves to succeed together. In addition to our enriching and collegial working environment, we offer highly competitive compensation and benefits packages, technology talks by our experts, a beautiful modern office, daily catered lunches, and more. As a Site Reliability Engineer (SRE), you will work at the intersection of production operations and software development as you improve, manage, and monitor production-critical infrastructure and data pipelines. At Voleon, many SREs serve together on a Production Operations team tasked with improving shared production infrastructure. Others are embedded with teams of software engineers to improve specific production systems owned by those teams. Voleon SREs work on important real-world problems and collaborate with passionate and talented colleagues in an empowering, results-driven environment. 
This role is a way to make a real difference: your contributions will make our critical systems more reliable, lower operational risk, and increase the efficiency of our engineering effort. Responsibilities: Improve fault-tolerance and maintainability of code in proprietary data pipelines and trading systems; Diagnose and fix bugs in code; Lead complex deployments; Automate manual workflows; Track and prioritize outstanding production-related issues; Share an on-call rotation responding to incidents to ensure the continuous operation of production-critical systems. Requirements: Experience with coding and debugging Python; Experience with Linux; Familiarity with Relational Databases & SQL; Sharp analytical and problem-solving skills and a persistent drive to make things work (better); Strong growth mindset and a passion for learning; Strong technical communication skills; Attention to detail; 2 years of relevant industry experience; An undergraduate degree or comparable training in a quantitative field or equivalent, relevant industry experience. Preferred Qualifications: Familiarity with best practices concerning code maintainability, documentation, quality assurance, continuous integration and deployment; Experience supporting production systems; Experience with any of the following: gRPC microservices, Postgres, Pandas, Golang, R, Git, Jenkins, Bazel, Prometheus, Grafana, Airflow, Kubernetes. The base salary for this position is $120,000 to $160,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation such as bonus compensation and other benefits. Our benefits package includes medical, dental and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match. 
“Friends of Voleon” Candidate Referral Program: If you have a great candidate in mind for this role and would like to have the potential to earn $7,500 - $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this form to submit your referral. For more details regarding eligibility, terms and conditions, please make sure to review the Voleon Referral Bonus Program. Equal Opportunity Employer: The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
    $120k-160k yearly Auto-Apply 50d ago
  • Reliability Engineer

    Tata Consulting Services 4.3company rating

    Marlborough, MA jobs

    * SRE able to quickly write automations and self-healing scripts, and to understand and resolve errors from microservices on any stack (full-stack capable). * Operations skillset with enough attitude to scale to a Reliability Engineer. * Should be able to handle customer communication and coordination with offshore team. TCS Employee Benefits Summary: * Discretionary Annual Incentive. * Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. * Family Support: Maternal & Parental Leaves. * Insurance Options: Auto & Home Insurance, Identity Theft Protection. * Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. * Time Off: Vacation, Time Off, Sick Leave & Holidays. * Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Salary Range - $100,000-$120,000 a year
    $100k-120k yearly 21d ago
  • Reliability Engineer (SRE OMS)

    Tata Consulting Services 4.3company rating

    Marlborough, MA jobs

    * SRE with Sterling OMS skillset and adaptability to distributed systems, developing automations with AI/GenAI tools, etc. * Operations skillset with enough attitude to scale to a Reliability Engineer. * Should be able to handle customer communication and coordination with offshore team. TCS Employee Benefits Summary: * Discretionary Annual Incentive. * Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. * Family Support: Maternal & Parental Leaves. * Insurance Options: Auto & Home Insurance, Identity Theft Protection. * Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. * Time Off: Vacation, Time Off, Sick Leave & Holidays. * Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Salary Range - $100,000-$120,000 a year
    $100k-120k yearly 21d ago
  • Site Reliability Engineer

    Tata Consulting Services 4.3company rating

    Miami, FL jobs

    Must-Have * Strong development experience in .NET and Java frameworks. * Proven leadership managing SRE and DevOps teams. * Incident and problem management using ServiceNow. * Expertise in Observability: AppDynamics, PagerDuty, Grafana, Splunk. * Deep understanding of CI/CD with Azure ADO, GitHub, Maven, Gradle. * Automated regression and performance testing experience with Selenium, JMeter. * Experience building self-healing systems. * Strong skills in root cause analysis (RCA) and problem identification. * Ability to define and enforce SLAs and response metrics. * Document and maintain version-controlled knowledge repositories. * Exposure to self-healing systems in SRE or DevOps context. Good-to-Have * Certifications in AWS/GCP/Azure Salary Range: $100,000-$120,000 a year TCS Employee Benefits Summary: Discretionary Annual Incentive. Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. Family Support: Maternal & Parental Leaves. Insurance Options: Auto & Home Insurance, Identity Theft Protection. Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. Time Off: Vacation, Time Off, Sick Leave & Holidays. Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Experience working in a Travel/Tourism industry
    $100k-120k yearly 18d ago
  • Site Reliability Engineer - Capital Markets

    Jefferies Financial Group Inc. 4.8company rating

    New York, NY jobs

    Jefferies is seeking a Site Reliability Engineer to play an instrumental role in supporting Equity Front office trading application, risk and middle office real time products, developed and used for Equity Cash and ETS application. As part of the wider platform engineering team, you will be working closely with the Business users interactively throughout the day, along with technical, analysis and testing colleagues. Investigation and resolution of the work items at hand will require competent technical skills and a keen intellect. The business is a growth area, with current investments taking place in all the technology, business and middle office areas. Responsibilities: * Front Line Site Reliability Engineering and Support functions for Equity trading systems used by Jefferies clients as well as internal users. * Build monitoring tools for application and infrastructure components. * Implement and manage scalable infrastructure using cloud-native technologies and tools. * Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. * Partner with business, development and infrastructure teams to improve services through rigorous testing and release procedures. * Develop and maintain CI/CD pipelines to streamline deployment processes. * Expedient deployment of new systems. Capacity planning, Platform Management, and support for increasing volumes and business growth. * Create sustainable systems and services through automation. * Collaborate with Application team to establish and enforce production and development standards. * Document procedures, best practices and troubleshooting FAQs. * Resolve complex application and technical problems. * Debugging the system and fixing the production related issues. * Escalate / follow-up on permanent fix for development related issues. * Lead incident response efforts and post-mortem analysis to prevent future occurrences. 
* Handles complex operational tasks and recommends process and technology changes. * Global support and includes weekend availability to troubleshoot production related issues and perform checkouts. * Ability to work both independently and in groups in an energetic, diverse environment. * Participate in on-call rotations to ensure 24/7 system availability and support. * Support compliance and legal queries. Qualifications: * Strong experience in Windows and Linux/Unix services. * Strong experience in scripting languages such as PowerShell, Python and SQL. * Strong Knowledge of monitoring tools - Nagios, Splunk, OTEL, Datadog * Strong Knowledge of FIX protocol * Strong Domain skills - Must have working experience in Capital Markets across modules and instruments especially - CASH, ETS, Bonds, Options, Futures, Swaps products * Experience in BFSI (Banking and Financial Industry) Domain applications with a proper understanding of the Trade Lifecycle. * Excellent communication, time management and project management skills. Primary Location Full Time Salary Range of $175,000 - $200,000
    $175k-200k yearly Auto-Apply 49d ago
  • Network Reliability Engineer III

    CME Group 4.4company rating

    Chicago, IL jobs

    As we embark on a journey to transform the Network Services Group in CME, we are seeking a Network Reliability Engineer III to join our dynamic team. In this role, you will design, develop and maintain self-service tools and applications that enhance productivity and reduce operational costs. You will work across the full stack, both front-end and back-end, to architect microservices (GKE) in Google Cloud Platform (GCP), driving our infrastructure towards greater automation and reliability. We are a global team across US, UK, India and Singapore made up of a diverse range of people from varied backgrounds who each bring unique network experiences and skill sets. The relatively new Network Reliability/Automation team are responsible for building a suite of custom automation tools and developing our self-healing capabilities while working closely with other members of the Network Services team in project delivery to ensure one of the largest Exchange network infrastructures in the world is highly available, resilient, secure and reliable. Responsibilities * Design, develop and maintain self-service and automation tools to streamline IT operations and reduce manual effort. * Engage in full-stack development, delivering responsive front-end interfaces as well as robust scalable back-end services. * With support, architect, deploy and scale microservices on GCP, with particular emphasis on containers and Google Kubernetes Engine (GKE). * Manage cloud infrastructure via Infrastructure-as-Code (IaC), primarily using Terraform to provision and maintain resources. * Operate and troubleshoot solutions on Linux-based platforms, leveraging Visual Studio Code (VSCode) as the primary development environment. * Adhere to software engineering best practices, including PEP8 coding standards, SOLID design principles, and established SDLC processes. * Implement and manage CI/CD pipelines with a DevOps mindset, ensuring rapid, reliable delivery of code. 
* Develop and consume Flask-based RESTful APIs to support network and security automation. * Collaborate within an Agile Scrum framework, utilizing tools such as Bitbucket and Jira to track progress and manage sprints. * Apply strong analytical and problem-solving skills to balance multiple project variables and deliver high-quality solutions on schedule. What we are looking for * Approximately 2-3 years' hands-on Python programming experience, with a demonstrable track record of automation or tooling projects. * Knowledge and experience working with both Python Django and Flask in a corporate environment. * Any experience in network and security automation, coupled with understanding of network fundamentals (routing, switching, firewalls, VPNs) would be beneficial. * Experience developing REST APIs using Flask (or a comparable Python framework). * Applicants with front-end experience using Javascript/JQuery/HTML5/CSS would be ideal. * Familiarity with Infrastructure-as-Code using Terraform (or similar) to manage cloud resources. * Comfortable working in Linux environments and proficient in using Visual Studio Code (VSCode). * Strong software engineering mindset: adherence to PEP8, SOLID principles, and best practices for SDLC, CI/CD and DevOps. * Excellent communication skills, both verbal and written, with the ability to convey technical concepts to diverse stakeholders. * Highly analytical, with the ability to troubleshoot complex issues and manage multiple tasks concurrently. * Experience working in Agile Scrum teams, utilizing Bitbucket and Jira (or equivalent tools) for version control and project tracking. Personal Attributes * Proactive and positive attitude, taking initiative to identify and resolve issues ahead of time. * Collaborative team player, eager to contribute knowledge and assist colleagues. * Innovative thinker who brings fresh ideas and constructive suggestions for continuous improvement. 
Education Bachelor's Degree in Computer Science, Engineering or a related field is preferred. Equivalent practical experience will also be considered. CME Group is committed to offering a competitive total rewards package for our employees that recognizes their contributions to the business and reflects our long-term investment in their future. The pay range for this role is $100,700-$167,800. Actual salary offered will be dependent on a wide array of factors including but not limited to: relevant experience, skills, education and comparison to internal employees (where relevant). Our compensation program also includes an annual target bonus opportunity for all employees, as well as the opportunity to become an owner in the company through our broad-based equity program. Through our benefits program, we strive to offer flexibility, value and choice. From comprehensive health coverage, to a retirement package that includes both a 401(k) and an active pension plan, to highly competitive education reimbursement provisions, paid time off and a mental health benefit, CME Group offers a holistic benefits package for our team and their dependents. CME Group: Where Futures are Made CME Group is the world's leading derivatives marketplace. But who we are goes deeper than that. Here, you can impact markets worldwide. Transform industries. And build a career by shaping tomorrow. We invest in your success and you own it - all while working alongside a team of leading experts who inspire you in ways big and small. Problem solvers, difference makers, trailblazers. Those are our people. And we're looking for more. At CME Group, we embrace our employees' unique experiences and skills to ensure that everyone's perspectives are acknowledged and valued. As an equal-opportunity employer, we consider all potential employees without regard to any protected characteristic. 
Important Notice: Recruitment fraud is on the rise, with scammers using misleading promises of job offers and interviews to solicit money and personal information from job seekers. CME Group adheres to established procedures designed to maintain trust, confidence and security throughout our recruitment process.
    $100.7k-167.8k yearly 60d+ ago
  • Principal Site Reliability Engineer - Foundational Services

    Jpmorganchase 4.8company rating

    Palo Alto, CA jobs

    Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a Principal Site Reliability Engineer at JP Morgan Chase within the [insert LOB or sub LOB], you draw upon your advanced knowledge to identify new opportunities to influence critical incident management and improve the end-to-end lifecycle of software development for the firm. You will have the opportunity to manage, design, and implement infrastructure components to improve reliability and ensure operational efficiency. Job responsibilities Identifies and solves problems of high complexity Works with development teams throughout the Software Development Life Cycle to ensure sustainable software releases Leads medium to large projects by bringing together the proper perspective, identifying roadblocks, and integrating feedback from team members and subject matter experts at the firm Participates in support responsibilities for coverage of critical applications Sees problems as opportunities to improve Required qualifications, capabilities, and skills Formal training or certification on site reliability engineering concepts and 10+ years applied experience Advanced knowledge of software applications and technical processes with considerable depth in one or more technical disciplines Ability to determine how each system relates to each other and use breadth of tools to build automation to improve reliability for the firm Experience with translating research, analysis, and tests into business recommendations Ability to balance and be accountable for the work of multiple architects and designers Preferred qualifications, capabilities, and skills Understands and leads partnerships across job functions (e.g., Cybersecurity and Data) to develop efficient and developer-friendly systems Engages team members and expresses complex ideas with appropriate level of detail, 
while also providing constructive feedback
    $140k-177k yearly est. Auto-Apply 7d ago
  • Reliability Engineer*

    3M 4.6company rating

    Georgia jobs

    Job Title Reliability Engineer Collaborate with Innovative 3Mers Around the World Choosing where to start and grow your career has a major impact on your professional and personal life, so it's equally important you know that the company that you choose to work at, and its leaders, will support and guide you. With a wide variety of people, global locations, technologies and products, 3M is a place where you can collaborate with other curious, creative 3Mers. This position provides an opportunity to transition from other private, public, government or military experience to a 3M career. The Impact You'll Make in this Role As a hands-on Reliability Engineer, you will have the opportunity to tap into your curiosity and collaborate with some of the most innovative people around the world. Here, you will make an impact by: Performing failure mode and effects analysis to ensure that proper Preventive & Predictive Maintenance programs are implemented, audited and improved on all existing and future assets. Applying Reliability Based Maintenance programs such as Reliability Centered Maintenance (RCM) and Total Productive Maintenance (TPM). Assessing and developing the capability of mechanics in their role in reliability improvement and advancing their technical capabilities. Analyzing data (failure, cost, uptime, etc.) and applying appropriate reliability analysis tools to develop and implement improvement plans. Performing and documenting equipment criticality analysis in support of an effective critical spares strategy. Submitting recommendations and justification for capital expenditures that support and improve the Reliability Program. Providing an external awareness of methods and technologies that advance our own internal body of knowledge for the improvement of our operations reliability. 
Your Skills and Expertise To set you up for success in this role from day one, 3M requires (at a minimum) the following qualifications: Technical degree or higher (completed and verified prior to start) and two (2) years of manufacturing experience in a private, public, government or military environment. OR Associate's degree or higher (completed and verified prior to start) and two (2) years of manufacturing experience in a private, public, government or military environment. AND One (1) year of experience with mechanical and electrical drawings. Additional qualifications that could help you succeed even further in this role include: Bachelor's degree in Electrical, Mechanical, or Mechatronics Engineering from an accredited institution Five (5) years of manufacturing experience in an automotive or aerospace private, public, government or military environment Experience with reliability analysis and with predictive (PdM) and preventive (PM) maintenance. Skills include strong communication, independence, strategic thinking, and problem solving, as well as PLCs, automation, and variable frequency drives. Work location: On-site Clarkston, GA Travel: May include up to 5% domestic/international Relocation Assistance: Not Authorized Must be legally authorized to work in country of employment without sponsorship for employment visa status (e.g., H1B status). Responsibilities of this position may include direct and/or indirect physical or logical access to information, systems, technologies subjected to the regulations/compliance with U.S. Export Control Laws. U.S. Export Control laws and U.S. Government Department of Defense contracts and sub-contracts impose certain restrictions on companies and their ability to share export-controlled and other technology and services with certain "non-U.S. persons" (persons who are not U.S. 
citizens or nationals, lawful permanent residents of the U.S., refugees, "Temporary Residents" (granted Amnesty or Special Agricultural Worker provisions), or persons granted asylum (but excluding persons in nonimmigrant status such as H-1B, L-1, F-1, etc.) or non-U.S. citizens. To comply with these laws, and in conjunction with the review of candidates for those positions within 3M that may present access to export controlled technical data, 3M must assess employees' U.S. person status, as well as citizenship(s). The questions asked in this application are intended to assess this and will be used for evaluation purposes only. Failure to provide the necessary information in this regard will result in our inability to consider you further for this particular position. The decision whether or not to file or pursue an export license application is at 3M Company's sole election. Supporting Your Well-being 3M offers many programs to help you live your best life - both physically and financially. To ensure competitive pay and benefits, 3M regularly benchmarks with other companies that are comparable in size and scope. Chat with Max For assistance with searching through our current job openings or for more information about all things 3M, visit Max, our virtual recruiting Applicable to US Applicants Only:The expected compensation range for this position is $81,983 - $100,202, which includes base pay plus variable incentive pay, if eligible. This range represents a good faith estimate for this position. The specific compensation offered to a candidate may vary based on factors including, but not limited to, the candidate's relevant knowledge, training, skills, work location, and/or experience. In addition, this position may be eligible for a range of benefits (e.g., Medical, Dental & Vision, Health Savings Accounts, Health Care & Dependent Care Flexible Spending Accounts, Disability Benefits, Life Insurance, Voluntary Benefits, Paid Absences and Retirement Benefits, etc.). 
Additional information is available at: ******************************************************************* Faith Posting Date Range 08/11/2025 To 09/10/2025 Or until filled All US-based 3M full time employees will need to sign an employee agreement as a condition of employment with 3M. This agreement lays out key terms on using 3M Confidential Information and Trade Secrets. It also has provisions discussing conflicts of interest and how inventions are assigned. Employees that are Job Grade 7 or equivalent and above may also have obligations to not compete against 3M or solicit its employees or customers, both during their employment, and for a period after they leave 3M.Learn more about 3M's creative solutions to the world's problems at ********** or on Instagram, Facebook, and LinkedIn @3M.Responsibilities of this position include that corporate policies, procedures and security standards are complied with while performing assigned duties.Safety is a core value at 3M. All employees are expected to contribute to a strong Environmental Health and Safety (EHS) culture by following safety policies, identifying hazards, and engaging in continuous improvement.Pay & Benefits Overview: https://**********/3M/en_US/careers-us/working-at-3m/benefits/3M does not discriminate in hiring or employment on the basis of race, color, sex, national origin, religion, age, disability, veteran status, or any other characteristic protected by applicable law. Please note: your application may not be considered if you do not provide your education and work history, either by: 1) uploading a resume, or 2) entering the information into the application fields directly. 3M Global Terms of Use and Privacy Statement Carefully read these Terms of Use before using this website. Your access to and use of this website and application for a job at 3M are conditioned on your acceptance and compliance with these terms. 
Please access the linked document by clicking here, select the country where you are applying for employment, and review. Before submitting your application, you will be asked to confirm your agreement with the terms.
    $82k-100.2k yearly Auto-Apply 60d+ ago
  • Java Site Reliability Engineer, Messaging Platforms

    Pacific Investment Management Co 4.9company rating

    Austin, TX jobs

    We are a leading global asset management firm with over 3,000 employees across 20 offices in 15 countries; we help millions of investors around the world pursue their financial goals. We hire critical thinkers. People who thrive in a collaborative culture like ours where we solve real problems while building the future of finance. You Are excited to be part of a vibrant engineering community that values diversity, hard work, and continuous learning. Love solving complex real-world business problems. Recognize that cross-functional collaboration is a core component of success for the team. Believe there are multiple ways to solve most technical problems and are willing to debate the trade-offs. Have become a stronger engineer by making mistakes and learning from them. Are a doer, someone who wants to grow their career and gain experience across technologies and business functions. We Continuously invest in a high-performance and inclusive culture, in which a diversity of backgrounds, experiences and viewpoints are celebrated and valued. Encourage career mobility, so you can benefit from learning different functions and technologies, and we gain the benefits of your experience across teams. Run technology pro bono programs that help the non-profit community and give our engineering community opportunities to volunteer and participate. Offer education reimbursements and ongoing training in technology, communication, and diversity & inclusion. Embrace knowledge sharing through lunch-and-learns, demos, and technical forums. Consider our people to be our greatest asset-we will help you learn what PIMCO Technology has to offer so you can participate in activities that benefit your career while delivering impactful technology solutions. As a Java SRE in Trading Technology, you will: As our immediate need Help support the messaging platforms in use (MQ, AMPS, Kafka, etc.). 
Drive the firm's best use of these platforms, making sure all choices make sense, the correct tools are used for each job, and that we build a sustainable messaging strategy. Improve the operational efficiency and reduce the operational risk of our messaging platforms through better tools, better design, and better monitoring. In the future there will be new architectural or coding problems that we will need an experienced engineer to help solve. Work closely with the business and other teams to design and implement solutions that have immediate impact on the business and help us build towards our strategic vision across all our trade floor applications. We need someone proficient in Java, passionate about SRE practices, and able to collaborate effectively with an infrastructure team. We expect you to have a strong passion for messaging systems, including their proper setup, monitoring, and maintenance. At the same time, this role involves software development for target platforms once the immediate needs related to messaging platforms are resolved. You will work with a team consisting of 1 SRE and 1 Unix SA, with full support from the infrastructure and DevOps teams. Position Requirements Bachelor's degree in computer science or equivalent Strong Linux skills (including Chef, Puppet, and Ansible configuration tools) Strong experience with different messaging systems (Kafka, AMPS, MQ, FIX, etc.). Strong engineering culture (unit tests, CI/CD) Ability to work independently and in teams Good communication skills Working from the office in Austin 4 days a week. PIMCO follows a total compensation approach when rewarding employees which includes a base salary and a discretionary bonus. Base salary is the fixed component of compensation that is determined by core job responsibilities, relevant experience, internal level, and market factors. 
The discretionary bonus is used to award performance and therefore is determined by company, business, team, and individual performance. Salary Range: $ 175,000.00 - $ 240,000.00 Equal Employment Opportunity and Affirmative Action Statement PIMCO recruits and hires qualified candidates without regard to race, national origin, ancestry, religion (including religious dress and grooming practices), sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), sexual orientation, gender (including gender identity and expression), age, military or veteran status, disability (physical or mental), any factor prohibited by law, and as such affirms in policy and practice to support and promote the concept of equal employment opportunity and affirmative action, in accordance with all applicable federal, state, provincial and municipal laws. The company also prohibits discrimination on other basis such as medical condition, or marital status under applicable laws. Applicants with Disabilities PIMCO is an Equal Employment Opportunity/Affirmative Action employer. We provide reasonable accommodation for qualified individuals with disabilities, including veterans, in job application procedures. If you have any difficulty using our online system due to a disability and you would like to request an accommodation, you may contact us at ************ and leave a message. This is a dedicated line designed exclusively to assist job seekers with disabilities to apply online. Only messages left for this purpose will be considered. A response to your request may take up to two business days.
    $175k-240k yearly Auto-Apply 51d ago
