Reliability Engineer jobs at Bank of America - 738 jobs

Site Reliability Engineer
The Voleon Group 4.1
Berkeley, CA jobs
Voleon is a technology company that applies state‑of‑the‑art AI and machine learning techniques to real‑world problems in finance. For nearly two decades, we have led our industry and worked at the frontier of applying AI/ML to investment management. We have become a multibillion‑dollar asset manager, and we have ambitious goals for the future. Your colleagues will include internationally recognized experts in artificial intelligence and machine learning research as well as highly experienced finance and technology professionals. The people who shape our company come from other backgrounds, including concert music performances, humanitarian aid, opera singing, sports writing, and BMX racing. You will be part of a team that loves to succeed together. In addition to our enriching and collegial working environment, we offer highly competitive compensation and benefits packages, technology talks by our experts, a beautiful modern office, daily catered lunches, and more. As a Site Reliability Engineer (SRE), you will work at the intersection of production operations and software development as you improve, manage, and monitor production‑critical infrastructure and data pipelines. At Voleon, many SREs serve together on a Production Operations team tasked with improving shared production infrastructure. Others are embedded with teams of software engineers to improve specific production systems owned by those teams. Voleon SREs work on important real‑world problems and collaborate with passionate and talented colleagues in an empowering, results‑driven environment. This role is a way to make a real difference: your contributions will make our critical systems more reliable, lower operational risk, and increase the efficiency of our engineering effort. 
Responsibilities
* Improve fault‑tolerance and maintainability of code in proprietary data pipelines and trading systems
* Diagnose and fix bugs in code
* Lead complex deployments
* Automate manual workflows
* Track and prioritize outstanding production‑related issues
* Share an on‑call rotation responding to incidents to ensure the continuous operation of production‑critical systems
Requirements
* Experience with coding and debugging Python
* Experience with Linux
* Familiarity with Relational Databases & SQL
* Sharp analytical and problem‑solving skills and a persistent drive to make things work (better)
* Strong growth mindset and a passion for learning
* Strong technical communication skills
* Attention to detail
* 2 years of relevant industry experience
* An undergraduate degree or comparable training in a quantitative field or equivalent, relevant industry experience
Preferred Qualifications
* Familiarity with best practices concerning code maintainability, documentation, quality assurance, continuous integration and deployment
* Experience supporting production systems
* Experience with any of the following: gRPC microservices, Postgres, Pandas, Golang, R, Git, Jenkins, Bazel, Prometheus, Grafana, Airflow, Kubernetes
The base salary for this position is $120,000 to $160,000 in the location(s) of this posting. Individual salaries are determined through a variety of factors, including, but not limited to, education, experience, knowledge, skills, and geography. Base salary does not include other forms of total compensation such as bonus compensation and other benefits. Our benefits package includes medical, dental and vision coverage, life and AD&D insurance, 20 days of paid time off, 9 sick days, and a 401(k) plan with a company match.
Friends of Voleon Candidate Referral Program If you have a great candidate in mind for this role and would like to have the potential to earn $7,500 - $15,000 if your referred candidate is successfully hired and employed by The Voleon Group, please use this to submit your referral. For more details regarding eligibility, terms and conditions please make sure to review the Voleon Referral Bonus Program. Equal Opportunity Employer The Voleon Group is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
$120k-160k yearly 5d ago

Senior AI SRE: Scale GenAI Reliability & Impact
Charles Schwab Corporation 4.8
San Francisco, CA jobs
A leading financial services firm is seeking a Senior AI Site Reliability Engineer responsible for designing and managing the reliability of AI-driven applications. In this role, you'll work on innovative projects and mentor junior engineers while collaborating with cross-functional teams. Candidates should have extensive experience in software development and reliability engineering, with a particular focus on AI systems. This on-site position is located in San Francisco and offers opportunities for professional growth and development.
$118k-152k yearly est. 1d ago
Staff Site Reliability Engineer
Visa 4.5
Ashburn, VA jobs
Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose - to uplift everyone, everywhere by being the best way to pay and be paid. Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa. Job Description What a Staff Site Reliability Engineer Does at Visa As a Staff Site Reliability Engineer (SRE), you will be part of a cross-functional Operations & Infrastructure group responsible for the reliability, availability, performance, and optimization of Visa Spend Clarity for Enterprises (VSCE). You will support teams in running robust applications, lead incident resolution efforts, and drive operational excellence through automation, observability, and platform modernization. This role is critical to Visa's transformation as we scale our product to a broader range of issuers through cloud infrastructure and automation. You will work closely with engineering, operations, and product teams to ensure our systems are resilient, secure, and continuously improving. Why This Role Matters You will be part of a critical global function within the VSCE product at a time when we are modernizing our platform through cloud infrastructure and automation. This transformation enables us to scale our product to a broader range of issuers and is a key focus area within Visa Commercial Solutions with ambitious growth goals. Our Culture At Visa, your individuality fits right in. Working here gives you an opportunity to impact the world, invest in your career growth, and be part of an inclusive and diverse workplace. 
We are a global team of disruptors, trailblazers, innovators, and risk-takers who are helping drive economic growth in even the most remote parts of the world. We're creatively moving the industry forward and doing meaningful work that brings financial literacy and digital commerce to millions of unbanked and underserved consumers. You're an individual. We're the team for you. Together, let's transform the way the world pays. Essential Functions Operate and improve distributed systems and SaaS applications in production environments. Lead and coordinate incident response efforts, ensuring timely resolution and root cause analysis. Collaborate with engineering teams to enhance system reliability, uptime, and performance. Automate operational tasks using scripting and orchestration tools (e.g., PowerShell). Support and configure middleware, load balancers, and Web Application Firewalls. Drive strategic initiatives such as cloud migration and platform modernization. Apply AWS cloud expertise to solve infrastructure problems and scalability challenges. Monitor and manage enterprise systems using observability and alerting tools. Participate in a 24/7/365 On Call rotation, including shift and weekend support as needed. Contribute to internal platform development with a product-led mindset. Ensure secure and compliant software delivery in regulated environments. Support geographically dispersed systems across multiple time zones. Provide support and documentation for task handoffs and transitions. This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager. Qualifications Basic Qualifications * 5 or more years of relevant work experience with a Bachelors Degree or at least 2 years of work experience with an Advanced degree (e.g. 
Masters, MBA, JD, MD) or 0 years of work experience with a PhD Preferred Qualifications * 6 or more years of work experience with a Bachelors Degree or 4 or more years of relevant experience with an Advanced Degree (e.g. Masters, MBA, JD, MD) or up to 3 years of relevant experience with a PhD * Experience with transactional systems (e.g., banking, finance, telecommunications). * Proficiency in Microsoft stack (Windows Server, IIS, MS SQL Server). * Familiarity with middleware technologies (e.g., MQ, Active Directory, Session State). * Advanced experience with AWS cloud services, including designing and troubleshooting scalable, resilient infrastructure. * Knowledge of certificate management and secure system design (basic to intermediate level). * Strong troubleshooting, performance tuning, and capacity planning skills. * Exposure to PCI and other audit/control frameworks. * Experience with enterprise monitoring and orchestration tools. * Ability to work across time zones and with geographically dispersed teams. * Excellent communication, collaboration, and stakeholder management skills. * Self-motivated, adaptable, and committed to continuous learning and growth. * Experience leading initiatives and influencing across teams. * Customer-oriented mindset for both internal and external clients. * Committed to continuous learning and growth, with the ability to adapt quickly to evolving challenges and technologies. Additional Information Work Hours: Varies upon the needs of the department. Travel Requirements: This position requires travel 5-10% of the time. Mental/Physical Requirements: This position will be performed in an office setting. The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, frequently operate standard office equipment, such as telephones and computers. Visa is an EEO Employer. 
Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law. Visa will consider for employment qualified applicants with criminal histories in a manner consistent with applicable local law, including the requirements of Article 49 of the San Francisco Police Code. U.S. APPLICANTS ONLY: The estimated salary range for this position is $131,600 to $210,300 USD per year, which may include potential sales incentive payments (if applicable). Salary may vary depending on job-related factors which may include knowledge, skills, experience, and location. In addition, this position may be eligible for bonus and equity. Visa has a comprehensive benefits package for which this position may be eligible that includes Medical, Dental, Vision, 401(k), FSA/HSA, Life Insurance, Paid Time Off, and Wellness Program.
$131.6k-210.3k yearly 5d ago
Staff Site Reliability Engineer, ServiceNow
Visa 4.5
Highlands Ranch, CO jobs
Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories, dedicated to uplifting everyone, everywhere by being the best way to pay and be paid. At Visa, you'll have the opportunity to create impact at scale - tackling meaningful challenges, growing your skills and seeing your contributions impact lives around the world. Join Visa and do work that matters - to you, to your community, and to the world. Progress starts with you. Job Description The CMDB Site Reliability Engineer will be responsible for developing on the ServiceNow platform CMDB, ITOM Discovery, and other CMDB-related components as part of the Scrum-based Agile framework to advance and maintain Visa's CMDB functionality. This will include, but is not limited to, story implementation, data management (CI and Platform), and operational support (Incidents and Requests). Essential Functions: Be part of a global team that has operational support responsibilities where you will be required to work with our internal customers to resolve issues using the ServiceNow Incident management process and ticketing system. Expand and enhance your knowledge of the ServiceNow Platform and its advanced features and capabilities including developing table / attribute level security controls, implementing Business Rules, Flow Designer, Workflows, Client Scripts, UI Policy, and UI Actions as part of development activities for Catalogs, Scoped Apps or other platform activities. Design, develop, and maintain ETL solutions using Microsoft SSIS to support data warehousing and business intelligence initiatives. Monitor, troubleshoot, and optimize existing SSIS packages and ETL jobs for performance and reliability. Ingest, transform, and load data from various sources including relational databases, flat files (CSV, TXT), REST APIs, and ODBC connections. 
Write and optimize complex SQL queries, stored procedures, triggers, and scripts for data extraction and transformation. Develop and maintain comprehensive documentation for data flows, processes, and systems. Ensure data quality, integrity, and security throughout the ETL process. Participate in code reviews and contribute to best practices for SSIS and ETL development. Collaborate with stakeholders, analysts, and business users to understand requirements and deliver robust data pipelines. Apply data governance, security, and compliance standards to all processes. Work with ServiceNow ITOM Discovery and its Cloud Discovery capabilities in a multi-public Cloud environment. Work with end users and educate them on how to be more effective and efficient working on the platform either through informal meetings, brown bags or scheduled calls. Develop technical requirements, documentation and technical diagrams. Perform data profiling and root cause analysis for data issues. Implement data validation and reconciliation processes. Work with stakeholders to review stories and requirements to develop new capabilities as part of our monthly Sprints and Releases. This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager. Visa will accept applications for this role until at least January 31, 2026. Qualifications Basic Qualifications: 5+ years of relevant work experience with a Bachelor's Degree or at least 2 years of work experience with an Advanced degree (e.g. Masters, MBA, JD, MD) or 0 years of work experience with a PhD, OR 8+ years of relevant work experience. Preferred Qualifications: Candidate must have direct work experience troubleshooting issues on the ServiceNow platform with Catalog requests, data imports, or other business-related logic. Effective writing skills with the ability to document application configurations, system design and architecture, simple run books. 
Good communication skills to work effectively with team members, support personnel, management and customers in geographically dispersed locations and ability to work as part of a team as well as independently with minimum guidance. Ideal candidate would have completed their ServiceNow Certified System Administrator. Familiarity with source control and deployment processes for SSIS projects. Experience with data profiling, data quality assessment, and cleansing techniques. 8+ years of hands-on experience with ServiceNow development, CMDB, and ETL solutions using Microsoft SSIS. Strong proficiency in SQL (T-SQL) for data manipulation, querying, and optimization. Experience in ingesting data from ODBC data sources and processing data from flat files (CSV, TXT, etc.). Experience integrating data from REST APIs (JSON, XML) using SSIS or related tools. Proficient in data modeling (relational and dimensional) and data warehousing best practices. Knowledge of data governance, privacy, and security practices. Familiarity with job scheduling, automation tools, and workflow orchestration. Experience with metadata management and data lineage tracking. Additional Information Work Hours: Varies upon the needs of the department. Travel Requirements: This position requires travel 5-10% of the time. Mental/Physical Requirements: This position will be performed in an office setting. The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, frequently operate standard office equipment, such as telephones and computers. Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law. 
Visa will consider for employment qualified applicants with criminal histories in a manner consistent with applicable local law, including the requirements of Article 49 of the San Francisco Police Code. U.S. APPLICANTS ONLY: The estimated salary range for this position is $124,300 to $198,600 USD per year, which may include potential sales incentive payments (if applicable). Salary may vary depending on job-related factors which may include knowledge, skills, experience, and location. In addition, this position may be eligible for bonus and equity. Visa has a comprehensive benefits package for which this position may be eligible that includes Medical, Dental, Vision, 401(k), FSA/HSA, Life Insurance, Paid Time Off, and Wellness Program.
$124.3k-198.6k yearly 5d ago
Sr. Site Reliability Engineer
Visa 4.5
Austin, TX jobs
Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose - to uplift everyone, everywhere by being the best way to pay and be paid. Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa. Job Description Visa Technology & Operations LLC, a Visa Inc. company, needs a Sr. Site Reliability Engineer (multiple openings) in Austin, TX to: Provide technical support to Tier 0 Applications, ensuring they meet all service level agreements and team objectives. Participate in root cause analysis (RCA) for issues encountered for associated services. Assist in disaster recovery plan without impacting any related services. Implement extensive application monitoring objectives. Implement self-healing services to minimize or eliminate downtime. Implement application and system changes according to best practice. Work with the development team to resolve issues, enhance applications, and advise. Resolve incidents and problems in accordance with defined guidelines and meet operational level agreements. Ability to work after hours, including weekends, nights, and early mornings, on rotational shifts. Position reports to the Austin, Texas office and may allow for partial telecommuting. This position requires travel 5-10% of the time. Qualifications Basic Qualifications: Bachelor's degree in Computer Science, Engineering, Business Analytics or related field, followed by 2 years of experience in the job offered or in a related systems engineer or data engineer occupation. 
Alternatively, a Master's degree in Computer Science, Engineering, Business Analytics, or related field. Position requires experience in the following: Linux Operating Systems. Virtual Machines. Containers. Databases. MQ or Kafka. Middleware JVMs. Storage. Supporting Web Services (API) or Web UI or Batch applications based on Linux, Java, BASH, Python, Perl, Oracle, DB2, Hazelcast or Hadoop. Firewalls, Load Balancer, DNS, HTTP, TCP/IP, PKI, SSL, TLS, Digital Certificates, Encryption, Security Scanning or equivalent. Additional Information Worksite: Austin, TX This is a hybrid position. Hybrid employees can alternate time between both remote and office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs. Travel Requirements: This position requires travel 5-10% of the time. Mental/Physical Requirements: This position will be performed in an office setting. The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, frequently operate standard office equipment, such as telephones and computers. Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law. U.S. APPLICANTS ONLY: The estimated salary range for a new hire into this position is $111,238.00 USD to $171,800.00 USD per year, which may include potential sales incentive payments (if applicable). Salary may vary depending on job-related factors which may include knowledge, skills, experience, and location. In addition, this position may be eligible for bonus and equity. 
Visa has a comprehensive benefits package for which this position may be eligible that includes Medical, Dental, Vision, 401(k), FSA/HSA, Life Insurance, Paid Time Off, and Wellness Program.
$111.2k-171.8k yearly 5d ago
Sr. Site Reliability Engineer - Talent Day
Visa 4.5
Austin, TX jobs
Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose - to uplift everyone, everywhere by being the best way to pay and be paid. Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa. Job Description Essential Functions We are seeking a Site Reliability Engineer to work in the Product Reliability Engineering function within Operations & Infrastructure. This individual will: Perform day-to-day site reliability engineering functions including maintenance and incident resolution for Visa's applications, products, and services. Perform ongoing/proactive analysis of applications to detect potential problems and actively engage & facilitate the discussion to find the best possible solution. Work under direct supervision to ensure on-time delivery of projects, and production support plans for upgrades, enhancements, and deployments. Work closely with service partners such as product development, engineering teams to seamlessly implement the innovative solutions to improve the reliability, scalability, and efficiency. Assist in automating the routine tasks and processes to improve overall efficiency and reduce human errors. Actively participate in troubleshooting activities and SWAT calls and drive investigation towards swift resolution. Build comprehensive and robust documentation repositories that can facilitate knowledge transfer among PRE and Global Operations peers. Assist the team with implementing GenAI and machine learning trends to continuously optimize the application reliability and efficiency. 
Participate in the on-call roster to support the business, including off-hours. Be self-motivated and have excellent interpersonal and communication skills. This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager. Qualifications Basic Qualifications 2+ years of relevant work experience and a Bachelor's degree, OR 5+ years of relevant work experience. Preferred Qualifications 3 or more years of work experience with a Bachelor's Degree or more than 2 years of work experience with an Advanced Degree (e.g. Masters, MBA, JD, MD). Working knowledge of one or more programming languages such as Python, Java, .NET, C#, PowerShell, Bash scripting. Understanding of Linux/Unix systems. Understanding of networking concepts, protocols, and architecture. Proven track record of automating complex tasks and processes to improve efficiency and reliability. Basic understanding of AI frameworks and libraries. Additional knowledge in one of the following: Cloud technologies such as AWS, Azure, etc. Database management systems such as MSSQL, MongoDB, etc. Middleware technologies such as Tomcat, Apache, etc. Containerization technologies such as Docker, Kubernetes, etc. Infrastructure-as-code tools such as Terraform, Ansible, etc. Monitoring tools such as Splunk or others. Additional Information Work Hours: Varies upon the needs of the department. Travel Requirements: This position requires travel 5-10% of the time. Mental/Physical Requirements: This position will be performed in an office setting. The position will require the incumbent to sit and stand at a desk, communicate in person and by telephone, frequently operate standard office equipment, such as telephones and computers. Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. 
Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law. U.S. APPLICANTS ONLY: The estimated salary range for this position is $110,700 to $171,800 USD per year, which may include potential sales incentive payments (if applicable). Salary may vary depending on job-related factors which may include knowledge, skills, experience, and location. In addition, this position may be eligible for bonus and equity. Visa has a comprehensive benefits package for which this position may be eligible that includes Medical, Dental, Vision, 401(k), FSA/HSA, Life Insurance, Paid Time Off, and Wellness Program.
$110.7k-171.8k yearly 5d ago
REO Resiliency Engineering and Quality Leader (Hybrid)
Securian 3.7
Saint Paul, MN jobs
title is Infrastructure Dir." Mission "To lead the engineering discipline that ensures Securian's technology platforms and cloud services are built and operated with uncompromising resilience, performance, and quality. This role drives the design and automation of fault-tolerant, high-availability architectures across AWS, Azure, and GCP, ensuring the enterprise meets resiliency, scalability, and efficiency expectations at every layer of technology." Positioning The Director of Resilience Engineering and Quality Leader is both a strategic peer and technical counterpart to the Infrastructure & Reliability Engineering Leader. This role provides bench depth and succession coverage for REO's most technically complex domains while driving innovation in reliability, resilience, and performance practices. Strategic influence: Shapes cloud reliability, quality engineering, and resilience strategy across REO and Architecture domains. Operational authority: Leads Sr. Managers and Managers who own the execution of quality, resilience, and performance engineering capabilities. Enterprise collaboration: Works hand-in-hand with Technology, Solution, Business, Data, and Enterprise Architects to embed reliability and resilience as core architecture principles. Scope of Accountability Resilience Engineering & Cloud Reliability Architect and validate fault-tolerant, regionally resilient architectures across AWS, Azure, and GCP. Own resilience automation, chaos testing, and IaC-based recovery validation. Lead cross-cloud reliability design reviews and failure-mode analyses for critical systems. Quality Engineering & Continuous Testing Define enterprise-wide quality engineering strategy integrated into CI/CD pipelines. Drive automation-first testing (functional, non-functional, performance, resilience). Embed observability-driven quality validation and contract testing across services. 
Performance, Capacity & Efficiency Engineering Oversee predictive capacity planning, scaling automation, and cost/efficiency optimization (FinOps/GreenOps). Partner with Platform & Infrastructure teams to tune performance across application and platform layers. Measure and report on performance SLIs/SLAs aligned to REO's Reliability Metrics framework. Cross-Domain Architecture Collaboration Partner with Enterprise Architects to codify resilience and reliability standards in technology blueprints. Collaborate with Technology & Solution Architects to design service reliability into delivery architectures. Engage Data Architects for data resilience, replication, and pipeline reliability. Work with Business Architects to align technical reliability goals with critical business outcomes. Leadership & Talent Development Lead a team of Sr. Managers and Managers, fostering a high-performance, hands-on engineering culture. Build and mentor top-tier technical talent in cloud reliability, resilience, and quality automation. Partner with HR and REO Enablement to develop succession plans and technical competency frameworks. Core Technical Competencies AWS (primary) - Multi-account design, HA architecture, region failover, resilience automation, Terraform/CDK/CloudFormation. Azure & GCP (secondary) - Compute, networking, and reliability constructs; hybrid cloud design and failover integration. Infrastructure as Code (IaC) - Deep proficiency in Terraform, policy-as-code (OPA/Conftest), drift detection, pipeline integration. Reliability & Chaos Engineering - AWS Fault Injection Simulator, Gremlin, steady-state hypothesis design. Observability & Quality Automation - OpenTelemetry, Prometheus, CloudWatch, K6, Gatling; CI/CD quality gates and dashboards. Performance Engineering - Load, stress, and soak testing automation; performance profiling and SLO alignment. Disaster Recovery Automation - Cross-region orchestration, IaC-driven DR runs, replication validation. 
FinOps/GreenOps - Cloud cost and efficiency automation, carbon-aware scaling policies. Leadership Competencies Strategic Technical Leadership: Operates at the intersection of deep engineering and executive strategy. Multi-Domain Collaborator: Integrates reliability and resilience across architecture, operations, and business domains. Talent Multiplier: Develops and empowers senior managers, fostering engineering mastery and innovation. Credible Technical Authority: Trusted peer to Infrastructure & Reliability Engineering; capable of leading architecture reviews and executive briefings. Change Champion: Drives transformation of reliability practices across platforms, pipelines, and teams. Qualifications & Experience 12+ years in cloud engineering, reliability, or platform leadership roles. 5+ years leading Sr. Managers/Managers in technical domains. Proven expertise across AWS, with working knowledge of Azure and GCP. Experience with multi-cloud governance, DR design, IaC at scale, and reliability automation. Strong understanding of observability, SRE principles, and REO/ITIL-aligned reliability frameworks. Certifications: Required: AWS Certified Solutions Architect - Professional Preferred: AWS DevOps Engineer, Azure Solutions Architect Expert, Google Professional Cloud Architect Success Metrics 99.9% availability maintained for Tier-1 workloads. 100% coverage of DR automation for Tier-1 services. 25% annual increase in automated quality/test coverage. 15% annual improvement in resource efficiency and cost performance. Documented resilience participation across all enterprise architecture blueprints. Positive "technical peer readiness" and succession rating from Head of REO. Summary Value Proposition This Director role blends deep AWS reliability engineering expertise, multi-cloud technical breadth, and leadership scale. 
It ensures REO maintains both technical depth and leadership redundancy, and it strengthens the bridge between engineering execution and enterprise architecture alignment. This position will be in a hybrid working arrangement. Securian Financial believes in hybrid work as an integral part of our culture. Associates get the benefit of working both virtually and in our offices. If you're within a commutable distance (90 minutes), you'll join us 3 days each week in our offices to collaborate and build relationships. Our policy allows flexibility for the reality of business and personal schedules. The estimated base pay range for this job is $145,000.00 - $267,000.00. Pay may vary depending on job-related factors and individual experience, skills, knowledge, etc. More information on base pay and incentive pay (if applicable) can be discussed with a member of the Securian Financial Talent Acquisition team. Be you. With us. At Securian Financial, we understand that attracting top talent means offering more than just a job - it means providing a rewarding and fulfilling career. As a valued member of our high-performing team, we want you to connect with your work, your relationships and your community. Enjoy our comprehensive range of benefits designed to enhance your professional growth, well-being and work-life balance, including the advantages listed here: Paid time off: We want you to take time off for what matters most to you. Our PTO program provides flexibility for associates to take meaningful time away from work to relax, recharge and spend time doing what's important to them. And Securian Financial rewards associates for their service by providing additional PTO the longer you stay at Securian. Leave programs: Securian's flexible leave programs allow time off from work for parental leave, caregiver leave for family members, bereavement and military leave. Holidays: Securian provides nine company paid holidays. 
Company-funded pension plan and a 401(k) retirement plan: Share in the success of our company. Securian's 401(k) company contribution is tied to our performance up to 10 percent of eligible earnings, with a target of 5 percent. The amount is based on company results compared to goals related to earnings, sales and service. Health insurance: From the first day of employment, associates and their eligible family members - including spouses, domestic partners and children - are eligible for medical, dental and vision coverage. Volunteer time: We know the importance of community. Through company-sponsored events, volunteer paid time off, a dollar-for-dollar matching gift program and more, we encourage you to support organizations important to you. Associate Resource Groups: Build connections, be yourself and develop meaningful relationships at work through associate-led ARGs. Dedicated groups focus on a variety of interests and affinities, including: Mental Wellness and Disability Pride at Securian Financial Securian Young Professionals Network Securian Multicultural Network Securian Women and Allies Network Servicemember Associate Resource Group For more information regarding Securian's benefits, please review our Benefits page. This information is not intended to explain all the provisions of coverage available under these plans. In all cases, the plan document dictates coverage and provisions. Securian Financial Group, Inc. does not discriminate based on race, color, religion, national origin, sex, gender, gender identity, sexual orientation, age, marital or familial status, pregnancy, disability, genetic information, political affiliation, veteran status, status in regard to public assistance or any other protected status. If you are a job seeker with a disability and require an accommodation to apply for one of our jobs, please contact us by email at , by telephone (voice), or 711 (Relay/TTY). 
$145k-267k yearly 2d ago
Process Engineer
CTC 4.6
Cincinnati, OH jobs
20 hrs/week ONSITE Cincinnati, OH 45224 The Manufacturing Process Engineer will be responsible for evaluating, improving, and maintaining manufacturing processes and equipment to ensure efficiency, safety, and compliance. This role requires strong analytical skills, technical expertise, and the ability to drive continuous improvement initiatives across the plant. Responsibilities Evaluate existing manufacturing processes and identify areas for improvement. Inspect and maintain mechanical equipment performance within the plant. Diagnose production issues and implement effective solutions. Conduct cost-benefit analyses for new processes and equipment. Design detailed layouts for equipment, processes, and workflows. Research and develop new processes, equipment, and products. Implement cost-saving measures and quality control systems. Ensure compliance with safety standards and legal regulations. Maintain documentation and prepare technical reports. Must Have Process evaluation and continuous improvement experience. Mechanical equipment inspection and maintenance knowledge. Strong problem-solving and root cause analysis skills. Ability to perform cost-benefit analysis. Process design and workflow optimization expertise. Knowledge of quality control systems and regulatory compliance. Technical documentation and report preparation skills. Bachelor's degree in Mechanical, Industrial, or Manufacturing Engineering (or equivalent). 2 years of experience Nice to Have Experience with advanced manufacturing technologies (automation, robotics, Industry 4.0). Familiarity with Lean Manufacturing, Six Sigma, or Kaizen methodologies. Exposure to ERP systems (SAP, Oracle, Salesforce). Project management and cross-functional collaboration skills. Innovation mindset for R&D of new processes and products. Bilingual communication (English/Spanish) for global operations. Experience in cost-saving initiatives with measurable impact.
$55k-74k yearly est. 2d ago
Process Designer
Tata Consulting Engineers 4.3
Washington, WV jobs
“Together We Make Life Better”. Our quality engineering, sustainable solutions and safety record inspire everything we do. Our diverse and inclusive workforce allows for all employees to feel valued and safe to give their opinions and improve our company. Tata Consulting Engineers USA, LLC (TCE) is a multi-disciplinary engineering organization offering a full range of integrated engineering design, project support, procurement and construction management services to the energy and chemicals industries. Position Summary: The Process / Mechanical Designer will support capital and cost projects at the Washington Works site by developing detailed mechanical and process design deliverables. This role does not require a formal engineering degree but demands strong technical and design experience in an industrial setting. The designer will collaborate closely with project managers, engineers, drafters, and construction teams to ensure safe, efficient, and cost-effective project execution. Responsibilities: Adhere to company core values of Safety, Integrity, Partnership, Respect, and Ownership. Develop complete design packages including conceptual, preliminary, and construction deliverables. Develop and revise mechanical and process design drawings including P&IDs, equipment drawings, and general arrangements. Interpret sketches, field notes, and verbal instructions to produce accurate design documents. Conduct field verification and site walkdowns to support design accuracy. Collaborate with engineers, vendors, and stakeholders to resolve design challenges and optimize solutions. Review and verify drawings for accuracy, compliance, and constructability. Incorporate redlines and as-built updates into final drawing packages. Ensure compliance with applicable codes, standards, and company procedures. Participate in design reviews and provide input on constructability and safety. Prepare and revise Bills of Materials (BOMs) and technical specifications. 
Follow company QA/QC requirements. Support design change documentation. Qualifications & Experience: Required: Minimum 5 - 8 years of experience in mechanical or process design in a chemical or industrial environment. Familiarity with P&IDs, piping systems, and mechanical equipment layouts. Ability to read and interpret engineering drawings and specifications. Strong attention to detail and organizational skills. Ability to work independently and as part of a cross-functional team. Working knowledge of Microsoft products (Word and Excel). Proficiency in CAD software (2D/3D) - primarily MicroStation with knowledge of AutoCAD. Preferred: Familiarity with Management of Change (MOC) and Process Safety Management (PSM) documentation practices. Work Environment: Primarily office-based with frequent fieldwork in active chemical manufacturing areas. Must be able to access all areas of the plant, including elevated platforms. Exposure to industrial hazards such as moving equipment, chemicals, and varying weather conditions. Use of appropriate PPE is required. Physical Requirements: Ability to sit, stand, walk, climb, and stoop as needed. Must be able to lift up to 25 pounds occasionally. Additional Expectations: Strong problem-solving and reasoning abilities. Effective communication skills for working with cross-functional teams. Ability to manage multiple priorities and meet deadlines. Education Requirements: High school diploma or equivalent required. Associate degree or technical certification in drafting/design preferred.
$60k-75k yearly est. 1d ago
Process Improvement Specialist
DZ Corporation 4.3
The Villages, FL jobs
Reports To: Operations Manager The Process Improvement Specialist is responsible for optimizing production processes within the precast concrete facility. This role focuses on identifying inefficiencies, implementing process enhancements, and supporting quality and safety improvements across manufacturing operations. Working closely with production teams, engineers, and supervisors, the specialist helps streamline workflows, reduce waste, and ensure consistent product quality. Key Responsibilities: Process Analysis & Optimization: Observe and analyze daily production activities (casting, curing, reinforcement, finishing, etc.) to identify bottlenecks and improvement opportunities. Data Collection & Reporting: Gather and track production data such as cycle times, material usage, downtime, and defect rates to support improvement projects. Continuous Improvement Projects: Assist in implementing Lean, 5S, or Six Sigma initiatives to improve plant efficiency, reduce waste, and enhance workplace organization. Standard Work & Documentation: Help develop and update standard operating procedures (SOPs), work instructions, and visual management tools. Quality & Safety Support: Collaborate with Quality Control and Safety teams to ensure process changes meet safety standards and product specifications. Technical Support: Support the introduction of new molds, equipment, or materials by conducting process trials and documenting results. Collaboration: Partner with maintenance, engineering, and production supervisors to troubleshoot recurring process issues. Qualifications: Education: Associate's degree or technical diploma in Manufacturing Technology, Industrial Engineering, or related field. Equivalent experience in precast concrete production or process improvement will be considered. Experience: 2+ years in a manufacturing or precast concrete environment. Familiarity with Lean Manufacturing, 6S, or Continuous Improvement principles. 
Skills: Strong mechanical aptitude and understanding of production equipment. Ability to collect and interpret process data (cycle times, scrap, yield, etc.). Proficiency in Microsoft Office and basic data entry tools. Good communication and problem-solving skills. Team-oriented and hands-on approach. Preferred Qualifications: Experience with precast or concrete manufacturing processes (casting, curing, form setup, reinforcement, finishing). Knowledge of quality systems such as NPCA or PCI standards. Basic CAD or technical drawing reading ability. Certification in Lean or Six Sigma or willingness to acquire. Performance Indicators: Reduction in process waste or rework rates. Increased production throughput and efficiency. Improved safety compliance and incident reduction. Consistency in meeting product quality standards. Implementation and sustainability of improvement projects.
$68k-100k yearly est. 5d ago
Electromechanical Validation Engineer
Generis Tek Inc. 4.0
Milwaukee, WI jobs
Please Contact: To discuss this amazing opportunity, reach out to our Talent Acquisition Specialist Jigar Kachhia by email at **************************** or by phone at ************. We have a permanent Electromechanical Validation Engineer role for our client in Willowbrook, IL. Please let me know if you or any of your friends would be interested in this position. Position Details: Electromechanical Validation Engineer - Willowbrook, IL Location: Willowbrook, IL 60527 Project Duration: Full-time Permanent Job Summary: Join Client as an Electromechanical Validation Engineer and help ensure the reliability and performance of innovative battery diagnostic tools and systems. This role is ideal for engineers with deep electromechanical aptitude who thrive in a hands-on lab environment. You'll lead validation for a range of battery testers and diagnostic platforms for both traditional 12V ICE vehicles and high-voltage EV battery modules. Day-to-day work involves setting up complex test environments, troubleshooting real-world hardware issues, analyzing system behaviors, and collaborating with design, firmware, and manufacturing teams to ensure robust product performance. Key Responsibilities: • Develop and execute validation test plans for electromechanical systems (e.g., test equipment with embedded electronics, relays, sensors, and power electronics). • Build, maintain, and instrument hardware test setups involving DC power systems, loads, thermal management, cabling, enclosures, and mechanical interfaces. • Use tools like oscilloscopes, DAQ systems, power supplies, thermal chambers, and custom test fixtures to execute validation activities. • Investigate issues by analyzing both electrical signals and mechanical performance; drive root-cause resolution in cross-functional teams. • Define and manage validation timelines to align with hardware development milestones. 
• Act as primary liaison with external test labs for regulatory, certification, and environmental testing (e.g., thermal, vibration, EMC). • Own compliance, qualification, and certification efforts for design releases. • Author detailed validation plans, protocols, test reports, and engineering documentation. Position Requirements: • BS or higher in Electrical Engineering, Mechatronics, or a related discipline • Minimum 10 years of experience in hardware validation of electromechanical systems • Proven track record diagnosing mixed signal, power, and electromechanical issues in lab environments • Strong understanding of validation methods and lab instrumentation (oscilloscopes, DAQ, thermal cycling, high current load testing) • Experience with LabVIEW or equivalent for test automation. • Excellent technical communication and teamwork skills WHY CHOOSE Client: • Comprehensive Health Coverage: Medical, dental, and vision benefits that prioritize your well-being. • Secure Your Future: Life and disability insurance provided at no extra cost to you. • Invest in Tomorrow: 401K savings plan with company match. • Performance Rewards: Annual bonus and profit-sharing opportunities. • Time to Recharge: Enjoy 12 days of vacation per year (prorated based on start date); 5 emergency PTO days; plus 10 Company-paid holidays. • Continual Learning: Tuition reimbursement to support your educational goals. • Health & Wellness: Onsite wellness screenings, flu shots, and subsidized health club memberships. • Sustainable Choices: Free charging stations for hybrid and electric vehicles. • Exclusive Perks: Discounts with auto suppliers. • Appreciation in Action: Weekly breakfast or lunch as a gesture of our gratitude to our team. • Must be able to travel to external labs.
$61k-78k yearly est. 4d ago
Senior ML Engineer: Production Pipelines & HPC Expert
Capital One 4.7
McLean, VA jobs
A leading financial services company in Virginia seeks an experienced professional to design and build data-intensive solutions. The role requires expertise in C, C++, Python, Scala, and machine learning, along with the ability to lead teams and communicate complex concepts effectively. Candidates should possess a Bachelor's and preferably a Master's degree, with a proven track record in production-ready data pipelines and ML lifecycle. Competitive compensation and comprehensive benefits are offered.
$90k-111k yearly est. 2d ago
Staff Site Reliability Engineer
Figure 4.5
San Jose, CA jobs
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human-level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA. We are looking for a Site Reliability Engineer to own our internal systems infrastructure. This role is responsible for setting up and managing cloud and on-prem infrastructure to deliver highly available, reliable, and automated systems. Responsibilities: Be the go-to person for mission-critical infrastructure enabling critical operations such as Source Configuration Management, CI/CD systems, software distribution, supplier portals, manufacturing and more. Migrate SaaS to self-hosted solutions to enhance security and reliability. Implement monitoring and alerting systems, and define incident response plans and runbooks. Reduce human workload by automating deployment and scaling. Establish strong relationships with stakeholders to identify infrastructure needs and establish Service Level Objectives. Use a data-driven approach to demonstrate service robustness and track optimization work. Partner with the security team to ensure that security remediations and updates are applied in a timely manner. Requirements: Strong experience with Linux/Unix systems administration. Proficiency in programming/scripting. Extensive experience with cloud platforms (Azure, AWS, GCP) and on-prem hardware architectures. Experience designing, deploying, and operating high-availability, fault-tolerant, and distributed systems. 
Mastery of infrastructure as code (Terraform, CloudFormation, Ansible…). Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog…). Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancers, firewalls). Experience defining Service Level Objectives (SLO), developing runbooks/incident response plans, facilitating post-mortems and managing systems assets. Ability to work in cross-functional teams with developers, infra, and product teams. Excellent verbal and written communication skills. The US base salary range for this full-time position is $175,000 - $250,000 annually. The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
$175k-250k yearly Auto-Apply 35d ago
Site Reliability Engineer - Capital Markets
Jefferies Financial Group Inc. 4.8
Jersey City, NJ jobs
Jefferies is seeking a Site Reliability Engineer to play an instrumental role in supporting Equity Front office trading application, risk and middle office real time products, developed and used for Equity Cash and ETS application. As part of the wider platform engineering team, you will be working closely with the Business users interactively throughout the day, along with technical, analysis and testing colleagues. Investigation and resolution of the work items at hand will require competent technical skills and a keen intellect. The business is a growth area, with current investments taking place in all the technology, business and middle office areas. Responsibilities: Front Line Site Reliability Engineering and Support functions for Equity trading systems used by Jefferies clients as well as internal users. Build monitoring tools for application and infrastructure components. Implement and manage scalable infrastructure using cloud-native technologies and tools. Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. Partner with business, development and infrastructure teams to improve services through rigorous testing and release procedures. Develop and maintain CI/CD pipelines to streamline deployment processes. Expedient deployment of new systems. Capacity planning, Platform Management, and support for increasing volumes and business growth. Create sustainable systems and services through automation. Collaborate with Application team to establish and enforce production and development standards. Document procedures, best practices and troubleshooting FAQs. Resolve complex application and technical problems. Debug the system and fix production-related issues. Escalate / follow up on permanent fixes for development-related issues. Lead incident response efforts and post-mortem analysis to prevent future occurrences. 
Handle complex operational tasks and recommend process and technology changes. Provide global support, including weekend availability to troubleshoot production-related issues and perform checkouts. Ability to work both independently and in groups in an energetic, diverse environment. Participate in on-call rotations to ensure 24/7 system availability and support. Support compliance and legal queries. Qualifications: Strong experience in Windows and Linux/Unix services. Strong experience in scripting languages such as PowerShell, Python and SQL. Strong knowledge of monitoring tools - Nagios, Splunk, OTEL, Datadog. Strong knowledge of FIX protocol. Strong domain skills - must have working experience in Capital Markets across modules and instruments, especially CASH, ETS, Bonds, Options, Futures, Swaps products. Experience in BFSI (Banking and Financial Industry) Domain applications with a proper understanding of the Trade Lifecycle. Excellent communication, time management and project management skills. Primary Location Full Time Salary Range of $175,000 - $200,000
$175k-200k yearly Auto-Apply 59d ago
Staff Site Reliability Engineer
CME Group 4.4
Chicago, IL jobs
We're looking for a Staff Site Reliability Engineer to join our team, focusing on the core systems that power global financial markets. This isn't just about keeping the lights on; it's about pioneering the future of financial technology. As a member of our Clearing department, you'll be on the front lines, ensuring the integrity and performance of mission-critical systems that facilitate billions of dollars in daily transactions. If you're a builder at heart, driven by a passion for creating ultra-reliable and resilient systems, you'll thrive here. This is a hybrid role. You must be in our office 2+ days a week What You'll Get * A supportive environment fostering career progression, continuous learning, and an inclusive culture. * Broad exposure to CME's diverse products, asset classes, and cross-functional teams. * A competitive salary and comprehensive benefits package. Learn more about our career opportunities here. What You'll Do As a Staff Site Reliability Engineer, you'll be a visionary builder of our resilient infrastructure. You'll move beyond conventional operations to apply software engineering principles to every facet of our clearing systems. * Pioneer solutions to guarantee the reliability, performance, and availability of our CME clearing and risk systems, where every millisecond and every transaction counts. * Architect and implement cutting-edge solutions for application resiliency and fault tolerance. * Drive automation and continuous improvement across the entire system lifecycle, eliminating manual toil and enhancing operational excellence. * Integrate SRE principles directly into the software development lifecycle, embedding reliability from day one. * Collaborate with cross-functional development and platform teams, providing expert-level guidance to deploy and maintain critical applications. * Innovate and lead efforts to prevent incidents, enhance operational processes, and automate solutions at a global scale. 
* Spearhead the adoption of observability and performance testing, guiding teams to a "build with SRE mindset" culture. * Own the end-to-end operational integrity of products, understanding and contributing to the bigger picture of the organization. What You'll Bring * A strong academic background: Bachelor's degree in Engineering, Computer Science, Information Technology, or a related field is strongly preferred. * Cloud expertise: Hands-on experience deploying and operating applications using IaaS and PaaS on major cloud providers, preferably Google Cloud Services. * Coding fluency: Proficiency in one or more of the following languages: Java, Python, Bash, or Go. Typescript and/or Rust are a significant plus. * Infrastructure as Code (IaC) mastery: Experience with tools such as GKE, Terraform, CloudFormation, and Chef. * Proven reliability engineering skills: Deep knowledge of SRE and security best practices, with a track record of implementing them into workflows. A solid understanding of performance testing tools is essential, along with the ability to help teams resolve complex performance issues. * Automation prowess: Demonstrated experience with automation, CI/CD, orchestration, and configuration management. * Observability knowledge: Familiarity with logging and observability platforms such as OpenTelemetry and Prometheus. * A security-first mindset: Strong understanding of security and compliance frameworks. * Problem-solving abilities: Excellent written and verbal communication skills, with the ability to convey complex technical concepts clearly to both technical and non-technical audiences. * Strong collaboration skills: An agile team player who is self-motivated and can work with minimal supervision while juggling multiple concurrent projects. * A passion for innovation: A continuous desire to learn and stay up-to-date with the latest technologies and industry trends. 
CME Group is committed to offering a competitive total rewards package for our employees that recognizes their contributions to the business and reflects our long-term investment in their future. The pay range for this role is $128,500-$214,100. Actual salary offered will be dependent on a wide array of factors including but not limited to: relevant experience, skills, education and comparison to internal employees (where relevant). Our compensation program also includes an annual target bonus opportunity for all employees, as well as the opportunity to become an owner in the company through our broad-based equity program. Through our benefits program, we strive to offer flexibility, value and choice. From comprehensive health coverage, to a retirement package that includes both a 401(k) and an active pension plan, to highly competitive education reimbursement provisions, paid time off and a mental health benefit, CME Group offers a holistic benefits package for our team and their dependents. CME Group: Where Futures are Made CME Group is the world's leading derivatives marketplace. But who we are goes deeper than that. Here, you can impact markets worldwide. Transform industries. And build a career by shaping tomorrow. We invest in your success and you own it - all while working alongside a team of leading experts who inspire you in ways big and small. Problem solvers, difference makers, trailblazers. Those are our people. And we're looking for more. At CME Group, we embrace our employees' unique experiences and skills to ensure that everyone's perspectives are acknowledged and valued. As an equal-opportunity employer, we consider all potential employees without regard to any protected characteristic. Important Notice: Recruitment fraud is on the rise, with scammers using misleading promises of job offers and interviews to solicit money and personal information from job seekers. 
CME Group adheres to established procedures designed to maintain trust, confidence and security throughout our recruitment process.
$128.5k-214.1k yearly 60d+ ago
Reliability Engineer (SRE OMS)
Tata Consulting Services 4.3
Marlborough, MA jobs
* SRE with Sterling OMS skillset and adaptability to distributed systems, developing automations with AI/GenAI tools, etc. * Operations skillset with enough attitude to scale to a Reliability Engineer. * Should be able to handle customer communication and coordination with offshore team. TCS Employee Benefits Summary: * Discretionary Annual Incentive. * Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. * Family Support: Maternal & Parental Leaves. * Insurance Options: Auto & Home Insurance, Identity Theft Protection. * Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. * Time Off: Vacation, Time Off, Sick Leave & Holidays. * Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Salary Range - $100,000-$120,000 a year
$100k-120k yearly 13d ago
Reliability Engineer
Tata Consulting Services 4.3
Marlborough, MA jobs
* SRE who can quickly write automations and self-heal scripts, and understand and resolve errors from microservices on basically any stack (full-stack capable). * Operations skillset with enough attitude to scale to a Reliability Engineer. * Should be able to handle customer communication and coordination with offshore team. TCS Employee Benefits Summary: * Discretionary Annual Incentive. * Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. * Family Support: Maternal & Parental Leaves. * Insurance Options: Auto & Home Insurance, Identity Theft Protection. * Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. * Time Off: Vacation, Time Off, Sick Leave & Holidays. * Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Salary Range - $100,000-$120,000 a year
$100k-120k yearly 13d ago
Site Reliability Engineer
Tata Consulting Services 4.3
Miami, FL jobs
Must-Have * Strong development experience in .NET and Java frameworks. * Proven leadership managing SRE and DevOps teams. * Incident and problem management using ServiceNow. * Expertise in Observability: AppDynamics, PagerDuty, Grafana, Splunk. * Deep understanding of CI/CD with Azure ADO, GitHub, Maven, Gradle. * Automated regression and performance testing experience with Selenium, JMeter. * Experience building self-healing systems. * Strong skills in root cause analysis (RCA) and problem identification. * Ability to define and enforce SLAs and response metrics. * Document and maintain version-controlled knowledge repositories. * Exposure to self-healing systems in SRE or DevOps context. Good-to-Have * Certifications in AWS/GCP/Azure. Salary Range - $100,000-$120,000 a year TCS Employee Benefits Summary: Discretionary Annual Incentive. Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans. Family Support: Maternal & Parental Leaves. Insurance Options: Auto & Home Insurance, Identity Theft Protection. Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement. Time Off: Vacation, Time Off, Sick Leave & Holidays. Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing. Experience working in the Travel/Tourism industry.
$100k-120k yearly 10d ago
Site Reliability Engineer - Capital Markets
Jefferies Financial Group Inc. 4.8
New York, NY jobs
Jefferies is seeking a Site Reliability Engineer to play an instrumental role in supporting Equity front-office trading applications and the risk and middle-office real-time products developed and used for Equity Cash and ETS applications. As part of the wider platform engineering team, you will work closely and interactively with business users throughout the day, along with technical, analysis and testing colleagues. Investigating and resolving the work items at hand will require competent technical skills and a keen intellect. The business is a growth area, with current investments taking place in all the technology, business and middle office areas. Responsibilities: * Front-line Site Reliability Engineering and support functions for Equity trading systems used by Jefferies clients as well as internal users. * Build monitoring tools for application and infrastructure components. * Implement and manage scalable infrastructure using cloud-native technologies and tools. * Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding. * Partner with business, development and infrastructure teams to improve services through rigorous testing and release procedures. * Develop and maintain CI/CD pipelines to streamline deployment processes. * Expedient deployment of new systems. Capacity planning, platform management, and support for increasing volumes and business growth. * Create sustainable systems and services through automation. * Collaborate with the Application team to establish and enforce production and development standards. * Document procedures, best practices and troubleshooting FAQs. * Resolve complex application and technical problems. * Debug the system and fix production-related issues. * Escalate and follow up on permanent fixes for development-related issues. * Lead incident response efforts and post-mortem analysis to prevent future occurrences. 
* Handle complex operational tasks and recommend process and technology changes. * Provide global support, including weekend availability to troubleshoot production-related issues and perform checkouts. * Ability to work both independently and in groups in an energetic, diverse environment. * Participate in on-call rotations to ensure 24/7 system availability and support. * Support compliance and legal queries. Qualifications: * Strong experience in Windows and Linux/Unix services. * Strong experience in scripting languages such as PowerShell, Python and SQL. * Strong knowledge of monitoring tools: Nagios, Splunk, OTEL, Datadog. * Strong knowledge of the FIX protocol. * Strong domain skills: must have working experience in Capital Markets across modules and instruments, especially CASH, ETS, Bonds, Options, Futures and Swaps products. * Experience with BFSI (Banking, Financial Services and Insurance) domain applications, with a proper understanding of the Trade Lifecycle. * Excellent communication, time management and project management skills. Primary Location Full Time Salary Range of $175,000 - $200,000
$175k-200k yearly Auto-Apply 41d ago
Network Reliability Engineer III
CME Group 4.4
Chicago, IL jobs
As we embark on a journey to transform the Network Services Group at CME, we are seeking a Network Reliability Engineer III to join our dynamic team. In this role, you will design, develop and maintain self-service tools and applications that enhance productivity and reduce operational costs. You will work across the full stack (both front-end and back-end) to architect microservices (GKE) in Google Cloud Platform (GCP), driving our infrastructure towards greater automation and reliability. We are a global team across the US, UK, India and Singapore, made up of a diverse range of people from varied backgrounds who each bring unique network experiences and skill sets. The relatively new Network Reliability/Automation team is responsible for building a suite of custom automation tools and developing our self-healing capabilities while working closely with other members of the Network Services team in project delivery to ensure one of the largest Exchange network infrastructures in the world is highly available, resilient, secure and reliable. Responsibilities * Design, develop and maintain self-service and automation tools to streamline IT operations and reduce manual effort. * Engage in full-stack development, delivering responsive front-end interfaces as well as robust, scalable back-end services. * With support, architect, deploy and scale microservices on GCP, with particular emphasis on containers and Google Kubernetes Engine (GKE). * Manage cloud infrastructure via Infrastructure-as-Code (IaC), primarily using Terraform to provision and maintain resources. * Operate and troubleshoot solutions on Linux-based platforms, leveraging Visual Studio Code (VSCode) as the primary development environment. * Adhere to software engineering best practices, including PEP8 coding standards, SOLID design principles, and established SDLC processes. * Implement and manage CI/CD pipelines with a DevOps mindset, ensuring rapid, reliable delivery of code. 
* Develop and consume Flask-based RESTful APIs to support network and security automation. * Collaborate within an Agile Scrum framework, utilizing tools such as Bitbucket and Jira to track progress and manage sprints. * Apply strong analytical and problem-solving skills to balance multiple project variables and deliver high-quality solutions on schedule. What we are looking for * Approximately 2-3 years' hands-on Python programming experience, with a demonstrable track record of automation or tooling projects. * Knowledge and experience working with both Python Django and Flask in a corporate environment. * Any experience in network and security automation, coupled with an understanding of network fundamentals (routing, switching, firewalls, VPNs), would be beneficial. * Experience developing REST APIs using Flask (or a comparable Python framework). * Front-end experience using JavaScript/jQuery/HTML5/CSS would be ideal. * Familiarity with Infrastructure-as-Code using Terraform (or similar) to manage cloud resources. * Comfortable working in Linux environments and proficient in using Visual Studio Code (VSCode). * Strong software engineering mindset: adherence to PEP8, SOLID principles, and best practices for SDLC, CI/CD and DevOps. * Excellent communication skills, both verbal and written, with the ability to convey technical concepts to diverse stakeholders. * Highly analytical, with the ability to troubleshoot complex issues and manage multiple tasks concurrently. * Experience working in Agile Scrum teams, utilizing Bitbucket and Jira (or equivalent tools) for version control and project tracking. Personal Attributes * Proactive and positive attitude, taking initiative to identify and resolve issues ahead of time. * Collaborative team player, eager to contribute knowledge and assist colleagues. * Innovative thinker who brings fresh ideas and constructive suggestions for continuous improvement. 
Education Bachelor's Degree in Computer Science, Engineering or a related field is preferred. Equivalent practical experience will also be considered. #LI - Hybrid #LI - JK1 CME Group is committed to offering a competitive total rewards package for our employees that recognizes their contributions to the business and reflects our long-term investment in their future. The pay range for this role is $100,700-$167,800. Actual salary offered will be dependent on a wide array of factors including but not limited to: relevant experience, skills, education and comparison to internal employees (where relevant). Our compensation program also includes an annual target bonus opportunity for all employees, as well as the opportunity to become an owner in the company through our broad-based equity program. Through our benefits program, we strive to offer flexibility, value and choice. From comprehensive health coverage, to a retirement package that includes both a 401(k) and an active pension plan, to highly competitive education reimbursement provisions, paid time off and a mental health benefit, CME Group offers a holistic benefits package for our team and their dependents. CME Group: Where Futures are Made CME Group is the world's leading derivatives marketplace. But who we are goes deeper than that. Here, you can impact markets worldwide. Transform industries. And build a career by shaping tomorrow. We invest in your success and you own it - all while working alongside a team of leading experts who inspire you in ways big and small. Problem solvers, difference makers, trailblazers. Those are our people. And we're looking for more. At CME Group, we embrace our employees' unique experiences and skills to ensure that everyone's perspectives are acknowledged and valued. As an equal-opportunity employer, we consider all potential employees without regard to any protected characteristic. 
Important Notice: Recruitment fraud is on the rise, with scammers using misleading promises of job offers and interviews to solicit money and personal information from job seekers. CME Group adheres to established procedures designed to maintain trust, confidence and security throughout our recruitment process.
$100.7k-167.8k yearly 60d+ ago