Senior Site Reliability Engineer
San Francisco, CA jobs
Circle is a financial technology company at the epicenter of the emerging internet of money, where value can finally travel like other digital data - globally, nearly instantly and less expensively than legacy settlement systems. This ground-breaking new internet layer opens up previously unimaginable possibilities for payments, commerce and markets that can help raise global economic prosperity and enhance inclusion. Our infrastructure - including USDC, a blockchain-based dollar - helps businesses, institutions and developers harness these breakthroughs and capitalize on this major turning point in the evolution of money and technology.
What you'll be part of:
Circle is committed to visibility and stability in everything we do. As we grow as an organization, we're expanding into some of the world's strongest jurisdictions. Speed and efficiency are motivators for our success and our employees live by our company values: High Integrity, Future Forward, Multistakeholder, Mindful, and Driven by Excellence. We have built a flexible and diverse work environment where new ideas are encouraged and everyone is a stakeholder.
What you'll be responsible for:
The Site Reliability Engineer is responsible for building and maintaining Circle's common libraries and infrastructure to support the rapid development of software features; analyzing requirements, procedures, and problems to improve and modify existing systems; building and owning scalable microservices that are responsible for reliable and secure APIs; working with SRE to improve the software shipping experience and the speed and quality of iteration; building internal developer platform capabilities; collaborating with Product and Engineering teams to design, test, and ship software, including developing and documenting system design procedures, testing procedures, and quality standards; troubleshooting program and system malfunctions to restore normal functioning; consulting with management to ensure agreement on system principles; and writing the infrastructure to deliver great development experiences.
What you'll bring to Circle:
2-4 years of professional software development experience, with a strong foundation in object-oriented programming, preferably in languages such as Java or Golang
Hands-on experience with major cloud platforms, including AWS, Google Cloud Platform (GCP), and Microsoft Azure
Proficient with Kubernetes for container orchestration and managing scalable infrastructure
Skilled in SQL database design, including schema modeling and query optimization
Experience in the deployment and operation of production-quality, scalable software
Emphasis on clean, maintainable code with a focus on speed, quality, and high test coverage to support continuous delivery practices
Adaptable and quick learner, comfortable exploring new languages, frameworks, and technologies as needed
Computer Science degree or a closely related field (or foreign equivalent)
Solid understanding of API design and RESTful architecture, with the ability to derive and communicate well-structured designs
Excellent communicator, able to collaborate effectively across remote teams and clearly present technical ideas and solutions
Self-motivated with a growth mindset, thrives in fast-paced environments, delivers impactful user-focused software, and continuously seeks to improve without heavy oversight.
Circle is on a mission to create an inclusive financial future, with transparency at our core. We consider a wide variety of elements when crafting our compensation ranges and total compensation packages.
Starting pay is determined by various factors, including but not limited to: relevant experience, skill set, qualifications, and other business and organizational needs. Please note that compensation ranges may differ for candidates in other locations.
Base Pay Range: $147,500 - $195,000
We are an equal opportunity employer and value diversity at Circle. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Additionally, Circle participates in the E-Verify Program in certain locations, as required by law.
Should you require accommodations or assistance in our interview process because of a disability, please reach out to accommodations@circle.com for support. We respect your privacy and will connect with you separately from our interview process to accommodate your needs.
Senior Site Reliability Engineer - Enterprise Technology
New York, NY jobs
Hudson River Trading (HRT) is looking for a Senior Site Reliability Engineer to join our growing Enterprise Technology group. The SRE team sits within Enterprise Technology and is responsible for operating and optimizing corporate productivity & collaboration infrastructure for the entire firm, both on-prem and in the cloud.
As one of Enterprise Technology's first SREs, you will help to establish and grow our site reliability engineering practice in addition to ensuring the availability and reliability of systems within our stack.
This role requires a deep Linux operating system and application administration skill set, proficiency in Python, and solid experience with configuration management/IaC. Successful candidates should also have exceptional organizational, communication, and project management skills, as well as the ability to troubleshoot complex technical issues.
Responsibilities
Manage on-premise containerized web services, and a multitude of bridge services, integrations and batch processes that interconnect the elements of our productivity ecosystem
Proactively eliminate sources of operational work: engineering, not firefighting
Automate and troubleshoot a broad range of technical infrastructure both on-prem and in the cloud
Develop and implement monitoring solutions to ensure high system uptime and reliability
Enable transparency and high development velocity within the firm while maintaining a high bar for security. Find ways to reduce user friction, and make sure HRTers have access to the tools and data they need when they need it
Break down complexity, iterate, and communicate progress to a wide variety of leads and stakeholders
Qualifications
5+ years of experience in site reliability engineering or related disciplines
Proficiency with Python
Experience managing and monitoring containerized infrastructure
Experience working with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD
Expert experience with IaC and configuration management tools such as Terraform, SaltStack, Chef, Puppet, or Ansible
Annual base salary range of $150,000 to $250,000. Pay (base and bonus) may vary depending on job-related skills and experience. A sign-on and discretionary performance bonus may be provided as part of the total compensation package, in addition to company-paid medical and/or other benefits.
Culture
Hudson River Trading (HRT) brings a scientific approach to trading financial products. We have built one of the world's most sophisticated computing environments for research and development. Our researchers are at the forefront of innovation in the world of algorithmic trading.
At HRT we welcome a variety of expertise: mathematics and computer science, physics and engineering, media and tech. We're a community of self-starters who are motivated by the excitement of being at the cutting edge of automation in every part of our organization - from trading, to business operations, to recruiting and beyond. We value openness and transparency, and celebrate great ideas from HRT veterans and new hires alike. At HRT we're friends and colleagues - whether we are sharing a meal, playing the latest board game, or writing elegant code. We embrace a culture of togetherness that extends far beyond the walls of our office.
Feel like you belong at HRT? Our goal is to find the best people and bring them together to do great work in a place where everyone is valued. HRT is proud of our diverse staff; we have offices all over the globe and benefit from our varied and unique perspectives. HRT is an equal opportunity employer; so whoever you are we'd love to get to know you.
Senior Cluster Site Reliability Engineer
Berkeley, CA jobs
Voleon is a technology company that applies state-of-the-art machine learning techniques to real-world problems in finance. For nearly two decades, we have led our industry and worked at the frontier of applying machine learning to investment management. We have become a multibillion-dollar asset manager, and we have ambitious goals for the future.
As a Senior Cluster Site Reliability Engineer (SRE), you will help scale our research compute cluster to meet our growing needs, and you will leverage engineering skills to ensure high degrees of uptime, reliability, and robustness. Our research clusters are at the core of our R&D, and you will be directly responsible for keeping this key resource available and performant. Your work will provide a world-class HPC platform for researchers to focus on cutting-edge machine learning problems at scale. You will support both on-prem and cloud infrastructure, and work to provide the best experience to our technical staff. You will leverage IaC, Automation, and SRE principles to refine and hone a product that operates 24/7 to support Voleon.
The Cluster Operations team works on the frontline to triage and mitigate real-time operational issues. You will be an integral member of this team, solving day-to-day issues with high urgency, while also engineering systemic improvements and architectural fixes to prevent recurring issues. You will collaborate with engineering teams to develop improvements to monitoring/telemetry. You will help design and oversee operational frameworks to ensure the cluster operates within a set of rigorous SLAs.
Responsibilities
Be a first responder in the event of cluster outages or issues. Triage and resolve urgent issues as they arise
Ensure a high degree of cluster uptime (measured in multiple nines), and define and track SLAs to quantify reliability
Diagnose systemic/recurring patterns of problems, and engineer precision solutions to them in collaboration with engineering teams
Develop robust metrics and observability for cluster health and use those metrics to inform your work. Build out custom observability mechanisms when off-the-shelf ones won't do
Help software and research teams design policies around fair cluster usage, and help develop enforcement mechanisms for said policies
Assist in forecasting cluster growth, and help select appropriate scale-up strategies. Help optimize operations across dimensions of cost and usability
Requirements
5+ years of experience in SRE or DevOps roles, preferably working as a senior engineer or tech lead
Knowledge of HPC/batch compute frameworks (Slurm, Kueue, AWS/GCP Batch) and/or machine learning training systems (Kubeflow, MLflow, Horovod)
Ability to develop scripts and utilities of moderate complexity in a common scripting language (Python, Ruby, etc.)
Familiarity with infrastructure-as-code and configuration management tools (Terraform, Ansible)
Experience with cloud infrastructure (AWS or GCP)
Familiarity designing and implementing modern observability stacks (Prometheus, Grafana, Loki, ELK, OpenTelemetry)
Experience with distributed storage technologies (Lustre, Ceph, S3)
Embodies a "system engineer" rather than "system administrator" mindset, thinking systematically and leveraging automation
Bachelor's degree in computer science
Preferred Qualifications
Hands-on experience with HPC frameworks (Slurm, Grid Engine) and Kubernetes-based job orchestrators (Airflow, Kueue, Kubeflow Pipelines), along with other distributed computing frameworks (Ray, Modin, Dask, Spark)
Familiarity with ML frameworks (PyTorch/TensorFlow, JAX, Horovod, DeepSpeed)
Familiarity with hybrid/on-prem environments
Experience with containerization (Docker, Podman, Singularity), particularly for HPC/batch compute environments
Experience with HPC networking (InfiniBand, RDMA)
Solid security/IAM foundations (Identity management systems, AWS/GCP IAM, Zero Trust)
The base salary range for this position is $205,000 to $235,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation such as bonus compensation and other benefits. Our benefits package includes medical, dental and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match.
“Friends of Voleon” Candidate Referral Program
If you have a great candidate in mind for this role and would like to have the potential to earn $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this form to submit your referral. For more details regarding eligibility, terms and conditions please make sure to review the Voleon Referral Bonus Program.
Equal Opportunity Employer
The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
Senior Site Reliability Engineer
Chicago, IL jobs
Our Team:
Technology drives our business. Our team is made up of talented software engineers, infrastructure engineers, leaders and UX professionals. We care about technology as a craft and a differentiator. We bring our global products to market with a mix of software, cloud, data centers, infrastructure, design and grit.
The Role:
Senior Site Reliability Engineer with extensive experience in automation and programming across the software development lifecycle (SDLC). In this role, you will leverage your technical expertise to develop and scale cloud-based solutions. You'll tackle complex challenges, collaborating closely with development and IT operations teams to enhance system visibility, improve communication, and deliver measurable business value. By optimizing feedback loops and building observable systems, you'll drive customer satisfaction and support revenue growth.
This role is based in our Chicago office, and we follow a hybrid policy in order to foster continuous collaboration.
Job Responsibilities:
Responsible for creating and implementing system enhancements that will improve the performance and reliability of the system.
Leading contributor individually and as a team member, providing direction and mentoring to others.
Work with a highly skilled team of engineers to bring improvements to the cloud and scale them.
Own deployment, availability, reliability, and performance of the systems.
Proactively identify and reduce issues by designing, testing, and implementing software-based solutions.
Be a strong contributor to development of platform services including architecture, provisioning, configuration, deployment, and support.
Requirements:
A bachelor's degree in computer science or a related field, with 3 to 5 years of software DevOps experience.
Strong scripting or programming skills for automating repetitive tasks
DevOps (CI/CD, Docker and Terraform) Experience
Experience architecting and automating cloud-native technologies, deploying applications, and provisioning infrastructure.
Hands-on Experience with Infrastructure as Code, using CloudFormation, Terraform, or other tools.
Experience architecting cloud-native CI/CD workflows and tools, such as AWS CodeDeploy
Hands-on Experience with microservices and distributed application architecture, such as containers, and/or serverless technology.
Good Debugging and troubleshooting skills
Experience with the software development lifecycle and delivery using Agile practices.
Experience with DB deployment and writing SQL/PLSQL queries
Compensation and Benefits
At Morningstar we believe people are at their best when they are at their healthiest. That's why we champion your wellness through a wide range of programs that support all stages of your personal and professional life. Here are some examples of the offerings we provide:
Financial Health
75% 401k match up to 7%
Stock Ownership Potential
Company provided life insurance - 1x salary + commission
Physical Health
Comprehensive health benefits (medical/dental/vision) including potential premium discounts and company-provided HSA contributions (up to $500-$2,000 annually) for specific plans and coverages
Additional medical Wellness Incentives - up to $300-$600 annually
Company-provided long- and short-term disability insurance
Emotional Health
Trust-Based Time Off
6-week Paid Sabbatical Program
6-Week Paid Family Caregiving Leave
Competitive 8-24 Week Paid Parental Bonding Leave
Adoption Assistance
Leadership Coaching & Formal Mentorship Opportunities
Annual Education Stipend
Tuition Reimbursement
Social Health
Charitable Matching Gifts program
Dollars for Doers volunteer program
Paid volunteering days
15+ Employee Resource & Affinity Groups
Total Cash Compensation Range
$114,100.00 - $193,975.00 USD Annual
Inclusive of annual base salary and target incentive
If you receive and accept an offer from us, we require that personal and any related investments be disclosed confidentially to our Compliance team (days vary by region). These investments will be reviewed to ensure they meet Code of Ethics requirements. If any conflicts of interest are identified, then you will be required to liquidate those holdings immediately. In addition, depending on your department and location of work, certain employee accounts must be held with an approved broker (for example, all U.S. employee accounts). If this applies and your account(s) are not with an approved broker, you will be required to move your holdings to an approved broker.
Morningstar's hybrid work environment gives you the opportunity to collaborate in-person each week as we've found that we're at our best when we're purposely together on a regular basis. In most of our locations, our hybrid work model is four days in-office each week. A range of other benefits are also available to enhance flexibility as needs change. No matter where you are, you'll have tools and resources to engage meaningfully with your global colleagues.
002_MstarAssocLLC Morningstar Investment Management LLC Legal Entity
Senior Site Reliability Engineer - Integration Services
Plano, TX jobs
Who we are
Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world's most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We're looking for talented team members who want to Dream. Do. Grow. with us.
An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company- delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create best-in-class customer experience in an innovative, collaborative environment.
To save time applying, Toyota does not offer sponsorship of job applicants for employment-based visas or any other work authorization for this position at this time.
Who we're looking for
Toyota Financial Services is hiring a Senior Site Reliability Engineer to support and scale our enterprise integration platforms. You'll focus on keeping our API-driven systems, like MuleSoft, Kafka, and Java microservices, resilient, observable, and automated.
This role is ideal for someone with a software engineering mindset who thrives in platform-focused environments and is passionate about reducing operational toil, improving reliability, and enabling velocity for development teams.
What you'll be doing
* Operate and scale integration and messaging platforms like MuleSoft, Kafka (MSK), Apache Camel, and TIBCO
* Build automation and self-healing capabilities using Python or equivalent scripting
* Define and manage SLIs/SLOs, health checks, and proactive remediation
* Establish observability with tools like Dynatrace, CloudWatch, and centralized logging
* Support enterprise middleware and COTS platforms (e.g., RightFax, Document Management Systems)
* Collaborate with architects and engineers to ensure solutions meet SRE and operational standards
* Participate in on-call rotations and lead incident response/postmortems
* Drive continuous improvement across deployment hygiene, monitoring, and platform resilience
What you bring
* 5+ years of experience in SRE, DevOps, or backend software engineering roles
* Proven experience with Java-based API development and integration tooling
* Hands-on with at least two of the following: MuleSoft, Kafka (MSK/Confluent), Apache Camel, TIBCO (BW/EMS)
* Strong scripting skills (Python preferred) to automate operations and reduce toil
* Experience with observability tools (Dynatrace highly preferred)
* Solid grasp of API security protocols (OAuth2, JWT, mTLS)
* Background in middleware, integration platforms, or event-driven systems
Added bonus if you have
* Experience working in hybrid or cloud-native environments (AWS preferred)
* Familiarity with CI/CD pipelines and infrastructure automation tools
* Exposure to integration design patterns (ESB, microservices, pub/sub)
What we'll bring
During your interview process, our team can fill you in on all the details of our industry-leading benefits and career development opportunities. A few highlights include:
* A work environment built on teamwork, flexibility, and respect
* Professional growth and development programs to help advance your career, as well as tuition reimbursement
* Team Member Vehicle Purchase Discount
* Toyota Team Member Lease Vehicle Program (if applicable)
* Comprehensive health care and wellness plans for your entire family
* Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota regardless of whether you contribute
* Paid holidays and paid time off
* Referral services related to prenatal services, adoption, childcare, schools and more
* Tax Advantaged Accounts (Health Savings Account, Health Care FSA, Dependent Care FSA)
* Relocation assistance (if applicable)
Belonging at Toyota
Our success begins and ends with our people. We embrace all perspectives and value unique human experiences. Respect for all is our North Star. Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members' efforts to dream, do and grow without questioning that they belong.
Applicants for our positions are considered without regard to race, ethnicity, national origin, sex, sexual orientation, gender identity or expression, age, disability, religion, military or veteran status, or any other characteristics protected by law.
Have a question, need assistance with your application or do you require any special accommodations? Please send an email to *****************************.
Senior Reliability Engineer - PCBA, Harness & Connectors
Sunnyvale, CA jobs
Figure is an AI Robotics company developing a general purpose humanoid. Our humanoid robot, Figure 02, is designed for commercial tasks and the home. We are based in San Jose, CA and require 5 days/week in-office collaboration. It's time to build.
We are looking for a Senior Reliability Engineer in charge of developing and executing reliability test plans for the printed circuit board assemblies (PCBAs), harnesses, and connectors to ensure they meet our humanoid robot product reliability targets. We need someone who can derive reliability targets and test specs with incomplete field usage knowledge, give data-driven recommendations to the design and manufacturing teams for what to improve, and advocate for design for reliability in the design phase.
Responsibilities:
Work with cross-functional teams, own hardware reliability requirements and validation strategy.
Develop and execute accelerated life tests for PCBAs, electronic components, electrical harness and connectors.
Lead DFMEA efforts with design engineers to assess design risks, impacts, controls, and corrective actions.
Design reliability test flows and procedures, communicate with internal and external/CM teams to execute tests and report results.
Work with test engineers to design setup and fixtures used in reliability testing.
Guide and support PCBA, harness, connector failure analysis, design of experiments (DOEs) and corrective action processes with cross-functional teams.
Analyze field data, assess field risks, and design tests that correlate to field usage conditions.
Requirements:
5+ years of experience in relevant reliability engineering areas.
Bachelor's degree or higher in relevant science and engineering fields.
Strong knowledge of environmental reliability test principles, models, and methodologies, such as high temperature high humidity, thermal cycle/shock, mechanical vibration/shock.
Strong knowledge of industry test standards such as AEC-Q, JEDEC, and IPC.
Strong knowledge of electrical circuits, PCBA design and relevant SW tools (e.g. Altium).
Strong knowledge of PCBA, harness and connector failure modes, mechanisms, and FA techniques.
Hands-on experience on field reliability risk analysis and failure prediction methods.
Hands-on experience with Weibull++, JMP, or other reliability statistical analysis software.
Hands-on experience on electronic circuit debug and relevant tools, e.g. source meter, oscilloscope.
Hands-on experience with 3D CAD tool (e.g. CATIA).
Bonus Qualifications:
Experience of shipping reliable robotics, consumer or automotive products.
Knowledge of PCBA, harness and connector manufacturing processes and quality control practices.
Hands-on experience of CATIA V6 CAD.
Hands-on experience of finite element analysis (FEA) SW, e.g. Ansys.
Hands-on experience of FA tools & techniques such as SEM/EDS, CT-Xray, Cross-section, FTIR.
Hands-on experience of accelerometers, load cells, strain gauges and relevant setup and data acquisition techniques.
The US base salary range for this full-time position is between $150,000 - $225,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
Senior Reliability Engineer - PCBA, Harness & Connectors
San Jose, CA jobs
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA.
We are looking for a Senior Reliability Engineer in charge of developing and executing reliability test plans for the printed circuit board assemblies (PCBAs), harnesses, and connectors to ensure they meet our humanoid robot product reliability targets. We need someone who can derive reliability targets and test specs with incomplete field usage knowledge, give data-driven recommendations to the design and manufacturing teams for what to improve, and advocate for design for reliability in the design phase.
Responsibilities:
Work with cross-functional teams, own hardware reliability requirements and validation strategy.
Develop and execute accelerated life tests for PCBAs, electronic components, electrical harness and connectors.
Lead DFMEA efforts with design engineers to assess design risks, impacts, controls, and corrective actions.
Design reliability test flows and procedures, communicate with internal and external/CM teams to execute tests and report results.
Work with test engineers to design setup and fixtures used in reliability testing.
Guide and support PCBA, harness, connector failure analysis, design of experiments (DOEs) and corrective action processes with cross-functional teams.
Analyze field data, assess field risks, and design tests that correlate to field usage conditions.
Requirements:
5+ years of experience in relevant reliability engineering areas.
Bachelor's degree or higher in relevant science and engineering fields.
Strong knowledge of environmental reliability test principles, models, and methodologies, such as high temperature high humidity, thermal cycle/shock, mechanical vibration/shock.
Strong knowledge of industry test standards such as AEC-Q, JEDEC, and IPC.
Strong knowledge of electrical circuits, PCBA design and relevant SW tools (e.g. Altium).
Strong knowledge of PCBA, harness and connector failure modes, mechanisms, and FA techniques.
Hands-on experience on field reliability risk analysis and failure prediction methods.
Hands-on experience with Weibull++, JMP, or other reliability statistical analysis software.
Hands-on experience on electronic circuit debug and relevant tools, e.g. source meter, oscilloscope.
Hands-on experience with 3D CAD tool (e.g. CATIA).
Bonus Qualifications:
Experience of shipping reliable robotics, consumer or automotive products.
Knowledge of PCBA, harness and connector manufacturing processes and quality control practices.
Hands-on experience of CATIA V6 CAD.
Hands-on experience of finite element analysis (FEA) SW, e.g. Ansys.
Hands-on experience of FA tools & techniques such as SEM/EDS, CT-Xray, Cross-section, FTIR.
Hands-on experience of accelerometers, load cells, strain gauges and relevant setup and data acquisition techniques.
The US base salary range for this full-time position is between $150,000 - $225,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
Senior Lead Site Reliability Engineer
Houston, TX jobs
Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.
As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Enterprise technology, Market risk team, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure those NFRs are accounted for in your products' design and test phases, that your service level indicators are effectively measuring customer experience, and that service level objectives are defined with stakeholders and implemented in production.
Job responsibilities
Creates high quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
Provides advice and mentoring to other engineers and acts as a key resource for technologists seeking advice on technical and business-related issues
Demonstrates site reliability principles and practices every day and champions the adoption of site reliability throughout your team
Collaborates with others to create and implement observability and reliability designs for complex systems that are robust, stable, and do not incur additional toil or technical debt
Works toward becoming an expert on the applications and platforms in your remit while understanding their interdependencies and limitations
Evolves and debugs critical components of applications and platforms
Provides comprehensive and ongoing guidance, tools, and solutions to support the firm's growth
Makes significant contributions to JPMorgan Chase's site reliability community via internal forums, communities of practice, guilds, and conferences
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 5+ years applied experience
Advanced knowledge in site reliability culture and principles with demonstrated ability to implement site reliability within an application or platform
Advanced knowledge and experience in observability such as white and black box monitoring, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.
Advanced knowledge of software applications and technical processes with considerable depth in one or more technical disciplines
Ability to communicate data-based solutions with complex reporting and visualization methods
Recognized as an active contributor to the engineering community
Continues to expand network and leads evaluation sessions with vendors to see how offerings can fit into the firm's strategy
Preferred qualifications, capabilities, and skills
Ability to anticipate, identify, and troubleshoot defects found during testing
Strong communication skills with ability to mentor and educate others on site reliability principles and practices
Staff Site Reliability Engineer
Sunnyvale, CA jobs
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA.
We are looking for a Site Reliability Engineer to own our internal systems infrastructure. This role is responsible for setting up and managing cloud and on-prem infrastructure to deliver highly available, reliable, and automated systems.
Responsibilities:
Be the go-to person for mission-critical infrastructure enabling critical operations such as Source Configuration Management, CI/CD systems, software distribution, supplier portals, manufacturing and more.
Migrate SaaS to self-hosted solutions to enhance security and reliability.
Implement monitoring and alerting systems, and define incident response plans and runbooks.
Reduce human workload by automating deployment and scaling.
Establish strong relationships with stakeholders to identify infrastructure needs and establish Service Level Objectives.
Use a data-driven approach to demonstrate service robustness and track optimization work.
Partner with the security team to ensure that security remediations and updates are applied in a timely manner.
Requirements:
Strong experience with Linux/Unix systems administration
Proficiency in programming/scripting
Extensive experience with cloud platforms (Azure, AWS, GCP) and on-prem hardware architectures
Experience designing, deploying, and operating high-availability, fault-tolerant, and distributed systems.
Mastery of infrastructure as code (Terraform, CloudFormation, Ansible…)
Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog…)
Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancers, firewalls)
Experience defining Service Level Objectives (SLO), developing runbooks/incident response plans, facilitating post-mortems and managing systems assets.
Ability to work in cross-functional teams with developers, infra, and product teams
Excellent verbal and written communication skills
The US base salary range for this full-time position is between $175,000 and $250,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
Principal Site Reliability Engineer
Palo Alto, CA jobs
Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a **Principal Site Reliability Engineer** at JPMorgan Chase within the **Enterprise Technology, AI/ML & Data Platforms division**, you will utilize your expertise to create innovative solutions that improve critical incident management and streamline the software development lifecycle throughout the organization. Your role will involve overseeing, designing, and deploying infrastructure components to enhance reliability and ensure operational efficiency.
**Job responsibilities**
+ Architect and implement observability platforms and tools for proactive detection and continuous improvement.
+ Lead the design and development of core observability services, including metrics pipelines and log aggregation.
+ Leverage modern technologies such as OpenTelemetry and AI/ML for anomaly detection and automated insights.
+ Collaborate with engineering and SRE teams to define service-level objectives (SLOs) and error budgets.
+ Provide technical leadership and mentorship to engineering teams, ensuring best practices in system design.
+ Champion observability as a first-class concern in the software development lifecycle.
+ Influence platform strategy and roadmap through deep technical insight and alignment with business priorities.
+ Write advanced documentation and create executive presentations that translate technical issues into business impact.
+ Participate in industry professional forums and monitor relevant industry technologies and standards.
+ Lead medium to large projects by bringing together the proper perspective and integrating feedback from team members.
+ Participate in support responsibilities for coverage of critical applications.
**Required qualifications, capabilities, and skills**
+ Formal training or certification on site reliability engineering concepts and 10+ years applied experience.
+ Ability to determine how systems relate to one another and build automation to improve reliability.
+ Experience with translating research, analysis, and tests into business recommendations.
+ Ability to balance and be accountable for the work of multiple architects and designers.
+ Understands and leads partnerships across job functions to develop efficient systems.
+ Engages team members and expresses complex ideas with appropriate level of detail, while providing constructive feedback.
+ Self-motivated and able to work well under pressure with minimal supervision.
+ Ability to tackle a problem by using a logical, systematic, sequential approach.
**Preferred qualifications, capabilities, and skills**
+ Experience with cloud-native instrumentation and streaming data platforms.
+ Influence technology and policy decisions while fostering commitment and confidence in team members.
+ Develop effective solutions and analyze competitive positions by considering market trends.
+ Support the introduction of innovative methods and communicate clearly to persuade audiences.
+ Demonstrate concern and meet the needs of both internal and external customers.
#LI-RB3
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans
**Base Pay/Salary**
Jersey City, NJ $204,250.00 - $285,000.00 / year; Palo Alto, CA $204,250.00 - $285,000.00 / year
Tencent Cloud PaaS Associate Site Reliability Engineer
Palo Alto, CA jobs
Business Unit
What the Role Entails
Job Description: Research industry solutions, combine the customer's business technology solutions with the characteristics of Tencent's audio and video products, identify valuable solutions, and organize them into sales support materials. Work closely with the business team to analyze the technical structure of the customer's media business and explore the customer's needs and value in audio and video scenarios. Provide industry solutions and cases serving the international market, such as OTT, social networking, games, education, and business. Conduct industry analysis and research, identify target customers that meet the goals, and conduct business development work.
Who We Look For
Bachelor's degree or above; computer- or MBA-related majors are preferred. Fluent English as a working language, good communication skills and customer service awareness, and good desk research and writing skills. Strong thinking skills, high business sensitivity, and excellent learning, logical thinking, and problem-solving abilities. Self-motivated and responsible, with passion for work, good stress resistance, and team spirit.
Location State(s)
US-California-Palo Alto
The expected base pay range for this position in the location(s) listed above is $76,400.00 to $143,900.00 per year. Actual pay may vary depending on job-related knowledge, skills, and experience. Employees hired for this position may be eligible for a sign on payment, relocation package, and restricted stock units, which will be evaluated on a case-by-case basis. Subject to the terms and conditions of the plans in effect, hired applicants are also eligible for medical, dental, vision, life and disability benefits, and participation in the Company's 401(k) plan. The Employee is also eligible for up to 15 to 25 days of vacation per year (depending on the employee's tenure), up to 13 days of holidays throughout the calendar year, and up to 10 days of paid sick leave per year. Your benefits may be adjusted to reflect your location, employment status, duration of employment with the company, and position level. Benefits may also be pro-rated for those who start working during the calendar year.
Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Senior Bankruptcy Process Engineer
Plano, TX jobs
Come join our amazing team and work remote from home!
The Senior Bankruptcy Process Engineer will be responsible for improving the Bankruptcy processes to ensure that all attributes associated with a process are defined and that effective controls are in place and actively monitored for continuous improvement opportunities. Materially improve business performance, shape the deployment of continuous improvement methodology, and develop tools/analytics to drive cultural and organizational transformation to accomplish business objectives. Collaborate and consult with senior business leaders, functional leaders, multiple business units, and employees to understand needs, map current states, design future states, and deploy sustainable processes. Conduct all activities in adherence to all applicable investor timelines and in accordance with the company's policies and procedures and all US state and federal laws and regulations where the company operates. The target pay for this position is $90,000 - $120,000 annually.
What you'll do:
• Learn all aspects of the business process, key performance drivers, and operational report metrics.
• Gain expert knowledge of the operational systems and utilization within the business process.
• Assist with scope management, change management and solution definition.
• Create functional specifications for new or modified systems and processes.
• Develop process maps and additional necessary business process management tools to assist the Bankruptcy leadership team in effectively managing the processes they own.
• Analyze data and interpret results, troubleshoot if results look incorrect, and request changes to data as needed to meet business requirements.
• Gather and document business requirements, formulate use cases, and participate in solution design.
• Design, develop and enhance bankruptcy department workflow systems (Fiserv, Tempo).
• Track requirements, provide status and ensure quality of solution throughout the project/initiative.
• Responsible for User Acceptance Testing and may be responsible for management and completion of Quality Assurance Testing (when applicable).
• Participate in project communications, training materials, and business procedures/documentation.
• Conceptualize and design dashboards and reports that are representative of the business unit's productivity.
• Create projects/ideas that will help gain efficiency, cost savings, visibility, and management of business unit activities via continuous improvement plans (CIPs)
• Manage projects and business unit stakeholders in relation to project needs. Proactive engagement required.
• Lead assigned projects within the organization through the project lifecycle, including analysis, specification, design, development, and deployment.
• Analyze findings and recommend process improvements to the business and/or leadership.
• Build and execute effective project communication plans to ensure stakeholders, team members and impacted parties are appropriately apprised of the project goals, expectations, status, and delivery.
• Establish key performance metrics, design reporting/dashboard solutions, and promote the use of structured information to drive enhanced business performance.
• Function as the Project Manager for all bankruptcy related projects. Ensure proper and timely communication of project milestones.
• Performs other duties and special projects as assigned.
What you'll need:
• High school diploma or equivalent work experience.
• Knowledge of various Bankruptcy platforms used within Loan Servicing
• Extensive knowledge of Tempo Application
• A strong working knowledge of all the bankruptcy processes and all Chapters; as well as knowledge of investor and regulatory requirements
• Strong Microsoft Excel, data analysis and process improvement skillsets
• Strong understanding of software development, lifecycle, and program/project management
• Proficient with Microsoft Office, SQL Server Management Studio, Word, Visio, PowerPoint, and Project
• Five (5) years' experience required in bankruptcy with increasing responsibilities in the banking, finance, or mortgage industry.
• Four plus (4+) years' experience working with continuous improvement methodologies such as Lean Manufacturing or Six Sigma. Yellow belt preferred.
• Four plus (4+) years' experience in successfully managing projects and/or initiatives under aggressive timelines.
Our Company:
Carrington Mortgage Services is part of The Carrington Companies, which provide integrated, full-lifecycle mortgage loan servicing assistance to borrowers and investors, delivering exceptional customer care and programs that support borrowers and their homeownership experience. We hope you'll consider joining our growing team of uniquely talented professionals as we transform residential real estate. To read more visit: ***************************
What We Offer:
Comprehensive healthcare plans for you and your family. Plus, a discretionary 401(k) match of 50% of the first 4% of pay contributed.
Access to several fitness, restaurant, retail (and more!) discounts through our employee portal.
Customized training programs to help you advance your career.
Employee referral bonuses so you'll get the opportunity to work with friends (and get some extra cash in your pocket!).
Educational Reimbursement.
Carrington Charitable Foundation contributes to the community through causes that reflect the interests of Carrington Associates. For more information about Carrington Charitable Foundation and the organizations and programs it supports through specific fundraising efforts, please visit: carringtoncf.org.
#Carrington
#LI-GV1
Principal Site Reliability Engineer
Palo Alto, CA jobs
Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact.
As a Principal Site Reliability Engineer at JPMorgan Chase within the Enterprise Technology, AI/ML & Data Platforms division, you will utilize your expertise to create innovative solutions that improve critical incident management and streamline the software development lifecycle throughout the organization. Your role will involve overseeing, designing, and deploying infrastructure components to enhance reliability and ensure operational efficiency.
Job responsibilities
Architect and implement observability platforms and tools for proactive detection and continuous improvement.
Lead the design and development of core observability services, including metrics pipelines and log aggregation.
Leverage modern technologies such as OpenTelemetry and AI/ML for anomaly detection and automated insights.
Collaborate with engineering and SRE teams to define service-level objectives (SLOs) and error budgets.
Provide technical leadership and mentorship to engineering teams, ensuring best practices in system design.
Champion observability as a first-class concern in the software development lifecycle.
Influence platform strategy and roadmap through deep technical insight and alignment with business priorities.
Write advanced documentation and create executive presentations that translate technical issues into business impact.
Participate in industry professional forums and monitor relevant industry technologies and standards.
Lead medium to large projects by bringing together the proper perspective and integrating feedback from team members.
Participate in support responsibilities for coverage of critical applications.
Required qualifications, capabilities, and skills
Formal training or certification on site reliability engineering concepts and 10+ years applied experience.
Ability to determine how systems relate to one another and build automation to improve reliability.
Experience with translating research, analysis, and tests into business recommendations.
Ability to balance and be accountable for the work of multiple architects and designers.
Understands and leads partnerships across job functions to develop efficient systems.
Engages team members and expresses complex ideas with appropriate level of detail, while providing constructive feedback.
Self-motivated and able to work well under pressure with minimal supervision.
Ability to tackle a problem by using a logical, systematic, sequential approach.
Preferred qualifications, capabilities, and skills
Experience with cloud-native instrumentation and streaming data platforms.
Influence technology and policy decisions while fostering commitment and confidence in team members.
Develop effective solutions and analyze competitive positions by considering market trends.
Support the introduction of innovative methods and communicate clearly to persuade audiences.
Demonstrate concern and meet the needs of both internal and external customers.
#LI-RB3
Site Reliability Engineer - Capital Markets
New York, NY jobs
Jefferies is seeking a Site Reliability Engineer to play an instrumental role in supporting the Equity front office trading application and the risk and middle office real-time products developed and used for the Equity Cash and ETS application. As part of the wider platform engineering team, you will work closely and interactively with the Business users throughout the day, along with technical, analysis and testing colleagues. Investigation and resolution of the work items at hand will require competent technical skills and a keen intellect. The business is a growth area, with current investments taking place in all the technology, business and middle office areas.
Responsibilities:
* Front-line Site Reliability Engineering and support functions for Equity trading systems used by Jefferies clients as well as internal users.
* Build monitoring tools for application and infrastructure components.
* Implement and manage scalable infrastructure using cloud-native technologies and tools.
* Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
* Partner with business, development and infrastructure teams to improve services through rigorous testing and release procedures.
* Develop and maintain CI/CD pipelines to streamline deployment processes.
* Expedient deployment of new systems. Capacity planning, Platform Management, and support for increasing volumes and business growth.
* Create sustainable systems and services through automation.
* Collaborate with Application team to establish and enforce production and development standards.
* Document procedures, best practices and troubleshooting FAQs.
* Resolve complex application and technical problems.
* Debug the system and fix production-related issues.
* Escalate / follow-up on permanent fix for development related issues.
* Lead incident response efforts and post-mortem analysis to prevent future occurrences.
* Handles complex operational tasks and recommends process and technology changes.
* Provide global support, including weekend availability to troubleshoot production-related issues and perform checkouts.
* Ability to work both independently and in groups in an energetic, diverse environment.
* Participate in on-call rotations to ensure 24/7 system availability and support.
* Support compliance and legal queries.
Qualifications:
* Strong experience in Windows and Linux/Unix services.
* Strong experience in scripting languages such as PowerShell, Python and SQL.
* Strong Knowledge of monitoring tools - Nagios, Splunk, OTEL, Datadog
* Strong Knowledge of FIX protocol
* Strong Domain skills - Must have working experience in Capital Markets across modules and instruments especially - CASH, ETS, Bonds, Options, Futures, Swaps products
* Experience in BFSI (Banking, Financial Services and Insurance) domain applications with a proper understanding of the Trade Lifecycle.
* Excellent communication, time management and project management skills.
Primary Location Full Time Salary Range of $175,000 - $200,000
Network Reliability Engineer III
Chicago, IL jobs
As we embark on a journey to transform the Network Services Group in CME, we are seeking a Network Reliability Engineer III to join our dynamic team. In this role, you will design, develop and maintain self-service tools and applications that enhance productivity and reduce operational costs. You will work across the full stack (both front-end and back-end) to architect microservices (GKE) in Google Cloud Platform (GCP), driving our infrastructure towards greater automation and reliability.
We are a global team across US, UK, India and Singapore made up of a diverse range of people from varied backgrounds who each bring unique network experiences and skill sets. The relatively new Network Reliability/Automation team are responsible for building a suite of custom automation tools and developing our self-healing capabilities while working closely with other members of the Network Services team in project delivery to ensure one of the largest Exchange network infrastructures in the world is highly available, resilient, secure and reliable.
Responsibilities
* Design, develop and maintain self-service and automation tools to streamline IT operations and reduce manual effort.
* Engage in full-stack development, delivering responsive front-end interfaces as well as robust scalable back-end services.
* With support, architect, deploy and scale microservices on GCP, with particular emphasis on containers and Google Kubernetes Engine (GKE).
* Manage cloud infrastructure via Infrastructure-as-Code (IaC), primarily using Terraform to provision and maintain resources.
* Operate and troubleshoot solutions on Linux-based platforms, leveraging Visual Studio Code (VSCode) as the primary development environment.
* Adhere to software engineering best practices, including PEP8 coding standards, SOLID design principles, and established SDLC processes.
* Implement and manage CI/CD pipelines with a DevOps mindset, ensuring rapid, reliable delivery of code.
* Develop and consume Flask-based RESTful APIs to support network and security automation.
* Collaborate within an Agile Scrum framework, utilizing tools such as Bitbucket and Jira to track progress and manage sprints.
* Apply strong analytical and problem-solving skills to balance multiple project variables and deliver high-quality solutions on schedule.
What we are looking for
* Approximately 2-3 years' hands-on Python programming experience, with a demonstrable track record of automation or tooling projects.
* Knowledge and experience working with both Python Django and Flask in a corporate environment.
* Any experience in network and security automation, coupled with understanding of network fundamentals (routing, switching, firewalls, VPNs) would be beneficial.
* Experience developing REST APIs using Flask (or a comparable Python framework).
* Applicants with front-end experience using JavaScript/jQuery/HTML5/CSS would be ideal.
* Familiarity with Infrastructure-as-Code using Terraform (or similar) to manage cloud resources.
* Comfortable working in Linux environments and proficient in using Visual Studio Code (VSCode).
* Strong software engineering mindset: adherence to PEP8, SOLID principles, and best practices for SDLC, CI/CD and DevOps.
* Excellent communication skills, both verbal and written, with the ability to convey technical concepts to diverse stakeholders.
* Highly analytical, with the ability to troubleshoot complex issues and manage multiple tasks concurrently.
* Experience working in Agile Scrum teams, utilizing Bitbucket and Jira (or equivalent tools) for version control and project tracking.
Personal Attributes
* Proactive and positive attitude, taking initiative to identify and resolve issues ahead of time.
* Collaborative team player, eager to contribute knowledge and assist colleagues.
* Innovative thinker who brings fresh ideas and constructive suggestions for continuous improvement.
Education
Bachelor's Degree in Computer Science, Engineering or a related field is preferred. Equivalent practical experience will also be considered.
#LI - Hybrid
#LI - JK1
CME Group is committed to offering a competitive total rewards package for our employees that recognizes their contributions to the business and reflects our long-term investment in their future. The pay range for this role is $100,700-$167,800. Actual salary offered will be dependent on a wide array of factors including but not limited to: relevant experience, skills, education and comparison to internal employees (where relevant). Our compensation program also includes an annual target bonus opportunity for all employees, as well as the opportunity to become an owner in the company through our broad-based equity program. Through our benefits program, we strive to offer flexibility, value and choice. From comprehensive health coverage, to a retirement package that includes both a 401(k) and an active pension plan, to highly competitive education reimbursement provisions, paid time off and a mental health benefit, CME Group offers a holistic benefits package for our team and their dependents.
CME Group: Where Futures are Made
CME Group is the world's leading derivatives marketplace. But who we are goes deeper than that. Here, you can impact markets worldwide. Transform industries. And build a career by shaping tomorrow. We invest in your success and you own it - all while working alongside a team of leading experts who inspire you in ways big and small. Problem solvers, difference makers, trailblazers. Those are our people. And we're looking for more.
At CME Group, we embrace our employees' unique experiences and skills to ensure that everyone's perspectives are acknowledged and valued. As an equal-opportunity employer, we consider all potential employees without regard to any protected characteristic.
Important Notice: Recruitment fraud is on the rise, with scammers using misleading promises of job offers and interviews to solicit money and personal information from job seekers. CME Group adheres to established procedures designed to maintain trust, confidence and security throughout our recruitment process. Learn more here.
Reliability Engineer*
Georgia jobs
Job Title
Reliability Engineer
Collaborate with Innovative 3Mers Around the World
Choosing where to start and grow your career has a major impact on your professional and personal life, so it's equally important you know that the company that you choose to work at, and its leaders, will support and guide you. With a wide variety of people, global locations, technologies and products, 3M is a place where you can collaborate with other curious, creative 3Mers.
This position provides an opportunity to transition from other private, public, government or military experience to a 3M career.
The Impact You'll Make in this Role
As a hands-on Reliability Engineer, you will have the opportunity to tap into your curiosity and collaborate with some of the most innovative people around the world. Here, you will make an impact by:
Performing failure mode and effects analysis to ensure that proper Preventive & Predictive Maintenance programs are implemented, audited and improved on all existing and future assets.
Applying Reliability Based Maintenance programs such as Reliability Centered Maintenance (RCM) and Total Productive Maintenance (TPM).
Assessing & developing the capability of mechanics in their role in reliability improvement and advancing their technical capabilities.
Analyzing data (failure, cost, uptime, etc.) and applying appropriate reliability analysis tools to develop & implement improvement plans.
Performing & documenting equipment criticality analysis in support of an effective critical spares strategy.
Submitting recommendations and justification for capital expenditures that support and improve the Reliability Program.
Providing an external awareness of methods and technologies that advance our own internal body of knowledge for the improvement of our operations reliability.
Your Skills and Expertise
To set you up for success in this role from day one, 3M requires (at a minimum) the following qualifications:
Technical degree or higher (completed and verified prior to start) and Two (2) years of manufacturing experience in a private, public, government or military environment.
OR
Associates Degree or higher (completed and verified prior to start) and Two (2) years of manufacturing experience in a private, public, government or military environment.
AND
One (1) year of experience with mechanical and electrical drawings.
Additional qualifications that could help you succeed even further in this role include:
Bachelor's degree in Electrical, Mechanical, or Mechatronics Engineering from an accredited institution
Five (5) years of manufacturing in automotive or aerospace private, public, government or military environment
Experience with reliability analysis, predictive (PdM), and preventative maintenance (PM).
Skills include… Strong communication, independent, strategic, problem solving. PLC, Automation, variable frequency drives
Work location:
On-site
Clarkston, GA
Travel: May include up to 5% domestic/international
Relocation Assistance: Not Authorized
Must be legally authorized to work in country of employment without sponsorship for employment visa status (e.g., H1B status).
Responsibilities of this position may include direct and/or indirect physical or logical access to information, systems, technologies subjected to the regulations/compliance with U.S. Export Control Laws.
U.S. Export Control laws and U.S. Government Department of Defense contracts and sub-contracts impose certain restrictions on companies and their ability to share export-controlled and other technology and services with certain "non-U.S. persons" (persons who are not U.S. citizens or nationals, lawful permanent residents of the U.S., refugees, "Temporary Residents" (granted Amnesty or Special Agricultural Worker provisions), or persons granted asylum (but excluding persons in nonimmigrant status such as H-1B, L-1, F-1, etc.) or non-U.S. citizens.
To comply with these laws, and in conjunction with the review of candidates for those positions within 3M that may present access to export controlled technical data, 3M must assess employees' U.S. person status, as well as citizenship(s).
The questions asked in this application are intended to assess this and will be used for evaluation purposes only. Failure to provide the necessary information in this regard will result in our inability to consider you further for this particular position. The decision whether or not to file or pursue an export license application is at 3M Company's sole election.
Supporting Your Well-being
3M offers many programs to help you live your best life - both physically and financially. To ensure competitive pay and benefits, 3M regularly benchmarks with other companies that are comparable in size and scope.
Chat with Max
For assistance with searching through our current job openings or for more information about all things 3M, visit Max, our virtual recruiting
Applicable to US Applicants Only: The expected compensation range for this position is $81,983 - $100,202, which includes base pay plus variable incentive pay, if eligible. This range represents a good faith estimate for this position. The specific compensation offered to a candidate may vary based on factors including, but not limited to, the candidate's relevant knowledge, training, skills, work location, and/or experience. In addition, this position may be eligible for a range of benefits (e.g., Medical, Dental & Vision, Health Savings Accounts, Health Care & Dependent Care Flexible Spending Accounts, Disability Benefits, Life Insurance, Voluntary Benefits, Paid Absences and Retirement Benefits, etc.). Additional information is available at: *******************************************************************
Good Faith Posting Date Range 08/11/2025 To 09/10/2025 Or until filled
All US-based 3M full time employees will need to sign an employee agreement as a condition of employment with 3M. This agreement lays out key terms on using 3M Confidential Information and Trade Secrets. It also has provisions discussing conflicts of interest and how inventions are assigned. Employees that are Job Grade 7 or equivalent and above may also have obligations to not compete against 3M or solicit its employees or customers, both during their employment, and for a period after they leave 3M.
Learn more about 3M's creative solutions to the world's problems at ********** or on Instagram, Facebook, and LinkedIn @3M.
Responsibilities of this position include that corporate policies, procedures and security standards are complied with while performing assigned duties.
Safety is a core value at 3M. All employees are expected to contribute to a strong Environmental Health and Safety (EHS) culture by following safety policies, identifying hazards, and engaging in continuous improvement.
Pay & Benefits Overview: https://**********/3M/en_US/careers-us/working-at-3m/benefits/
3M does not discriminate in hiring or employment on the basis of race, color, sex, national origin, religion, age, disability, veteran status, or any other characteristic protected by applicable law.
Please note: your application may not be considered if you do not provide your education and work history, either by: 1) uploading a resume, or 2) entering the information into the application fields directly.
3M Global Terms of Use and Privacy Statement
Carefully read these Terms of Use before using this website. Your access to and use of this website and application for a job at 3M are conditioned on your acceptance and compliance with these terms.
Please access the linked document by clicking here, select the country where you are applying for employment, and review. Before submitting your application, you will be asked to confirm your agreement with the terms.
Reliability Engineer*
Clarkston, GA jobs
**Job Title**
_Reliability Engineer_
**Collaborate with Innovative 3Mers Around the World**
Choosing where to start and grow your career has a major impact on your professional and personal life, so it's equally important you know that the company that you choose to work at, and its leaders, will support and guide you. With a wide variety of people, global locations, technologies and products, 3M is a place where you can collaborate with other curious, creative 3Mers.
**This position provides an opportunity to transition from other private, public, government or military experience to a 3M career.**
**The Impact You'll Make in this Role**
As a(n) HANDS ON Reliability Engineer, you will have the opportunity to tap into your curiosity and collaborate with some of the most innovative people around the world. Here, you will make an impact by:
+ Perform failure mode and effect analysis to assure the proper Preventive & Predictive Maintenance programs are implemented, audited and improved on all existing and future assets.
+ Application of Reliability Based Maintenance programs such as Reliability Centered Maintenance (RCM) and Total Productive Maintenance (TPM).
+ Assess & develop capability of mechanics on their role in reliability improvement and to advance their technical capabilities.
+ Analyze data (failure, cost, uptime, etc.) and apply appropriate reliability analysis tools to develop & implement improvement plans.
+ Perform & document equipment criticality analysis in support of an effective critical spares strategy.
+ Submit recommendations and justification for capital expenditures that support and improve the Reliability Program.
+ Provide an external awareness of methods and technologies that advance our own internal body of knowledge for the improvement of our operations reliability.
**Your Skills and Expertise**
To set you up for success in this role from day one, 3M requires (at a minimum) the following qualifications:
+ Technical degree or higher (completed and verified prior to start) and Two (2) years of manufacturing experience in a private, public, government or military environment.
OR
+ Associates Degree or higher (completed and verified prior to start) and Two (2) years of manufacturing experience in a private, public, government or military environment.
AND
+ One (1) year of experience with mechanical and electrical drawings.
Additional qualifications that could help you succeed even further in this role include:
+ Bachelor's degree in Electrical, Mechanical, or Mechatronics Engineering from an accredited institution
+ Five (5) years of manufacturing in automotive or aerospace private, public, government or military environment
+ Experience with reliability analysis, predictive (PdM), and preventative maintenance (PM).
+ Skills include... Strong communication, independent, strategic, problem solving. PLC, Automation, variable frequency drives
**Work location:**
+ **On-site**
+ **Clarkston, GA**
**Travel: May include up to 5% domestic/international**
**Relocation Assistance: Not Authorized**
**Must be legally authorized to work in country of employment without sponsorship for employment visa status (e.g., H1B status).**
_Responsibilities of this position may include direct and/or indirect physical or logical access to information, systems, technologies subjected to the regulations/compliance with U.S. Export Control Laws._
_U.S. Export Control laws and U.S. Government Department of Defense contracts and sub-contracts impose certain restrictions on companies and their ability to share export-controlled and other technology and services with certain "non-U.S. persons" (persons who are not U.S. citizens or nationals, lawful permanent residents of the U.S., refugees, "Temporary Residents" (granted Amnesty or Special Agricultural Worker provisions), or persons granted asylum (but excluding persons in nonimmigrant status such as H-1B, L-1, F-1, etc.) or non-U.S. citizens._
_To comply with these laws, and in conjunction with the review of candidates for those positions within 3M that may present access to export controlled technical data, 3M must assess employees' U.S. person status, as well as citizenship(s)._
_The questions asked in this application are intended to assess this and will be used for evaluation purposes only. Failure to provide the necessary information in this regard will result in our inability to consider you further for this particular position. The decision whether or not to file or pursue an export license application is at 3M Company's sole election._
**Supporting Your Well-being**
3M offers many programs to help you live your best life - both physically and financially. To ensure competitive pay and benefits, 3M regularly benchmarks with other companies that are comparable in size and scope.
**Chat with Max**
For assistance with searching through our current job openings or for more information about all things 3M, visit Max, our virtual recruiting
Applicable to US Applicants Only: The expected compensation range for this position is $81,983 - $100,202, which includes base pay plus variable incentive pay, if eligible. This range represents a good faith estimate for this position. The specific compensation offered to a candidate may vary based on factors including, but not limited to, the candidate's relevant knowledge, training, skills, work location, and/or experience. In addition, this position may be eligible for a range of benefits (e.g., Medical, Dental & Vision, Health Savings Accounts, Health Care & Dependent Care Flexible Spending Accounts, Disability Benefits, Life Insurance, Voluntary Benefits, Paid Absences and Retirement Benefits, etc.). Additional information is available at: ****************************************************************
Good Faith Posting Date Range 08/11/2025 To 09/10/2025 Or until filled
All US-based 3M full time employees will need to sign an employee agreement as a condition of employment with 3M. This agreement lays out key terms on using 3M Confidential Information and Trade Secrets. It also has provisions discussing conflicts of interest and how inventions are assigned. Employees that are Job Grade 7 or equivalent and above may also have obligations to not compete against 3M or solicit its employees or customers, both during their employment, and for a period after they leave 3M.
Learn more about 3M's creative solutions to the world's problems at ********** or on Instagram, Facebook, and LinkedIn @3M.
Responsibilities of this position include that corporate policies, procedures and security standards are complied with while performing assigned duties.
Safety is a core value at 3M. All employees are expected to contribute to a strong Environmental Health and Safety (EHS) culture by following safety policies, identifying hazards, and engaging in continuous improvement.
Pay & Benefits Overview: https://**********/3M/en_US/careers-us/working-at-3m/benefits/
3M does not discriminate in hiring or employment on the basis of race, color, sex, national origin, religion, age, disability, veteran status, or any other characteristic protected by applicable law.
**Please note: your application may not be considered if you do not provide your education and work history, either by: 1) uploading a resume, or 2) entering the information into the application fields directly.**
**3M Global Terms of Use and Privacy Statement**
Carefully read these Terms of Use before using this website. Your access to and use of this website and application for a job at 3M are conditioned on your acceptance and compliance with these terms.
Please access the linked document by clicking here (*************************************************************************************************), select the country where you are applying for employment, and review. Before submitting your application, you will be asked to confirm your agreement with the terms.
At 3M we apply science in collaborative ways to improve lives daily as our employees connect with customers all around the world. Learn more about 3M's creative solutions to global challenges at ********** or on Twitter @3M or @3MNews.
Site Reliability Engineer
Atlanta, GA jobs
Must Have Technical/Functional Skills
* Monitoring solutions - CloudWatch, Dynatrace, PagerDuty
* DevOps - GitLab, GitLab CI/CD, AWS Cloud Development Kit (CDK), CloudFormation (CFT) and CodePipeline
* Languages, IDEs, Tools & Architectures - Node.js, TypeScript, YAML, VSCode, IntelliJ, Eclipse, REST API, Postman, Docker
* AWS Technologies - API Gateway, Route 53, Lambda, Kafka, ElastiCache, PostgreSQL, SNS, Quarkus, EventBridge, Secrets Manager
Roles & Responsibilities
* Building and supporting a reliable application suite for the environment to meet the development and maintenance requirements of systems/platforms
* Implement Service Reliability Engineering by working as part of the development team to evaluate the health, stability, and reliability of applications
* Lead the team in best practices in incident, problem, and change management
* Utilizing monitoring, alerts, dashboards, and management tools to ensure the availability, reliability, cost, and performance of applications and services
* Constantly working to improve and implement automation of applications tasks
* Providing technical support for systems/platforms according to application SLAs
* Responsible for designing and developing resiliency in the application code, troubleshooting incidents, engaging with squads to address failure patterns, and participating in incident management
* Develop delivery pipelines and automated deployment scripts
* Configure services, such as databases and monitoring
Salary Range: $100,000-$125,000 a year
TCS Employee Benefits Summary:
Discretionary Annual Incentive.
Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
Family Support: Maternal & Parental Leaves.
Insurance Options: Auto & Home Insurance, Identity Theft Protection.
Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement.
Time Off: Vacation, Time Off, Sick Leave & Holidays.
Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.
Site Reliability Engineer III - AWM
Boston, MA jobs
We have an exciting and rewarding opportunity for you to take your software engineering career to the next level.
As a Software Engineer III at JPMorganChase within the Asset and Wealth Management Americas team, you serve as a seasoned member of an agile team to design and deliver trusted market-leading technology products in a secure, stable, and scalable way. You are responsible for carrying out critical technology solutions across multiple technical areas within various business functions in support of the firm's business objectives.
Job responsibilities
Executes software solutions, design, development, and technical troubleshooting with ability to think beyond routine or conventional approaches to build solutions or break down technical problems
Creates secure and high-quality production code and maintains algorithms that run synchronously with appropriate systems
Produces architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development
Gathers, analyzes, synthesizes, and develops visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems
Proactively identifies hidden problems and patterns in data and uses these insights to drive improvements to coding hygiene and system architecture
Contributes to software engineering communities of practice and events that explore new and emerging technologies
Adds to team culture of diversity, opportunity, inclusion, and respect
Required qualifications, capabilities, and skills
Formal training or certification on computer science and reliability concepts and 3+ years applied experience.
Hands-on practical experience in system design, application development, testing, and operational stability
Proficient in coding in one or more languages
Experience in developing, debugging, and maintaining code in a large corporate environment with one or more modern programming languages and database querying languages
Overall knowledge of the Software Development Life Cycle
Solid understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
Demonstrated knowledge of software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)
Preferred qualifications, capabilities, and skills
Familiarity with modern front-end technologies
Exposure to cloud technologies
Java Site Reliability Engineer, Messaging Platforms
Austin, TX jobs
We are a leading global asset management firm with over 3,000 employees across 20 offices in 15 countries; we help millions of investors around the world pursue their financial goals.
We hire critical thinkers. People who thrive in a collaborative culture like ours where we solve real problems while building the future of finance.
You
Are excited to be part of a vibrant engineering community that values diversity, hard work, and continuous learning.
Love solving complex real-world business problems.
Recognize that cross-functional collaboration is a core component of success for the team.
Believe there are multiple ways to solve most technical problems and are willing to debate the trade-offs.
Have become a stronger engineer by making mistakes and learning from them.
Are a doer, someone who wants to grow their career and gain experience across technologies and business functions.
We
Continuously invest in a high-performance and inclusive culture, in which a diversity of backgrounds, experiences and viewpoints are celebrated and valued.
Encourage career mobility, so you can benefit from learning different functions and technologies, and we gain the benefits of your experience across teams.
Run technology pro bono programs that help the non-profit community and give our engineering community opportunities to volunteer and participate.
Offer education reimbursements and ongoing training in technology, communication, and diversity & inclusion.
Embrace knowledge sharing through lunch-and-learns, demos, and technical forums.
Consider our people to be our greatest asset. We will help you learn what PIMCO Technology has to offer so you can participate in activities that benefit your career while delivering impactful technology solutions.
As a Java SRE in Trading Technology, you will:
As our immediate need
Help support the messaging platforms in use (MQ, AMPS, Kafka, etc.).
Drive the firm's best use of these platforms, making sure all choices make sense, the correct tools are used for solving each job, and that we build a sustainable messaging strategy.
Improve the operational efficiency and reduce the operational risk of our messaging platforms through better tools, better design, and better monitoring.
In the future
There will be new architectural or coding problems that we will need an experienced engineer to help solve.
Work closely with the business and other teams to design and implement solutions that have immediate impact to the business and help us build towards our strategic vision across all our trade floor applications.
We need someone proficient in Java, passionate about SRE practices, and able to collaborate effectively with an infrastructure team. We expect you to have a strong passion for messaging systems, including their proper setup, monitoring, and maintenance. At the same time, this role involves software development for target platforms once the immediate needs related to messaging platforms are resolved.
You will work with a team consisting of 1 SRE and 1 Unix SA, with full support from the infrastructure and DevOps teams.
Position Requirements
Bachelor's degree in computer science or equivalent
Strong Linux skills (including Chef, Puppet, and Ansible configuration tools)
Strong experience with different messaging systems (Kafka, AMPS, MQ, FIX, etc.).
Strong engineering culture (unit tests, CI/CD)
Ability to work independently and in teams
Good communication skills
Working from the office in Austin 4 days a week.
PIMCO follows a total compensation approach when rewarding employees which includes a base salary and a discretionary bonus. Base salary is the fixed component of compensation that is determined by core job responsibilities, relevant experience, internal level, and market factors. The discretionary bonus is used to award performance and therefore is determined by company, business, team, and individual performance.
Salary Range: $175,000.00 - $240,000.00
Equal Employment Opportunity and Affirmative Action Statement
PIMCO recruits and hires qualified candidates without regard to race, national origin, ancestry, religion (including religious dress and grooming practices), sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), sexual orientation, gender (including gender identity and expression), age, military or veteran status, disability (physical or mental), any factor prohibited by law, and as such affirms in policy and practice to support and promote the concept of equal employment opportunity and affirmative action, in accordance with all applicable federal, state, provincial and municipal laws. The company also prohibits discrimination on other bases, such as medical condition or marital status, under applicable laws.
Applicants with Disabilities
PIMCO is an Equal Employment Opportunity/Affirmative Action employer. We provide reasonable accommodation for qualified individuals with disabilities, including veterans, in job application procedures. If you have any difficulty using our online system due to a disability and you would like to request an accommodation, you may contact us at ************ and leave a message. This is a dedicated line designed exclusively to assist job seekers with disabilities to apply online. Only messages left for this purpose will be considered. A response to your request may take up to two business days.