Site Reliability Engineer (Azure App Services)
Reliability engineer job in Irving, TX
The Judge Group is a Technology, Talent & Learning Solutions company based in Wayne, PA, that helps professionals find top jobs with the nation's leading brands. We're looking to hire a Site Reliability Engineer (Azure App Services) for a full-time, permanent position based in Irving, TX.
About the Role
We are looking for a Site Reliability Engineer (SRE) with strong expertise in monitoring, debugging, and optimizing Azure App Services. This role ensures platform reliability, performance, and scalability. You will combine hands-on Azure experience with code-level debugging, observability best practices, and automation to prevent issues, reduce MTTD/MTTR, and deliver an exceptional experience for end-users.
Qualifications
Bachelor's degree in Computer Science, IT, or related field.
Microsoft Azure Fundamentals (AZ-900) certification required.
Proven SRE experience focusing on monitoring, debugging, and incident response.
Hands-on experience with Azure App Services, Application Insights, and Azure Monitor.
Skilled in Diagnose and Troubleshoot Tools, Kudu, and PowerShell scripting.
Strong programming fundamentals with ability to read and troubleshoot .NET/C# and Angular code.
Experience in on-call operations, incident response, and RCA writing.
Bonus: Experience with Grafana/Prometheus, DataDog/Dynatrace, Azure Front Door, CDN, Function Apps, WebJobs, Service Bus, or Event Hub.
Excellent communication, collaboration, and problem-solving skills.
Additional Azure certifications are a strong plus.
Quality Engineer (Richardson)
Reliability engineer job in Richardson, TX
Zentech is one of the leading and most highly certified U.S. based Electronics Manufacturing Services (EMS) providers in North America. We support original equipment manufacturers (OEMs) of medical devices, aerospace and defense products, and industrial equipment with engineering & manufacturing solutions. These solutions include product design, printed circuit board layout, test development, and manufacturing support through the whole product lifecycle.
The products and designs Zentech supports are an integral part of everyday life and of mission-critical environments. Many can be seen every day at sporting events, on delivery trucks, in medical offices, at construction sites, on American farms, and on commercial aircraft. Other products and designs are unseen but vital, such as the work we do for our domestic and foreign military customers who rely on our technical skills to help ensure our country remains safe, our warfighters remain out of harm's way, and our nation's networks remain free from intrusion.
Zentech has developed strengths in the required manufacturing processes for high reliability, high complexity, low-to-medium volume printed circuit boards and box builds, all with best-in-class ability to scale to higher volume products. Zentech products are proudly Made in America.
Not sure what skills you will need for this opportunity? Simply read the full description below to get a complete picture of candidate requirements.
Job Description: The Quality Assurance Engineer assists the QC Supervisor, Program Managers, Design Engineers and Manufacturing Engineers in the introduction and sustaining of products in production. This process includes analyzing customer controlled technical data packages (e.g., engineering drawings, schematics, bills of material, specifications, etc.) in order to:
Identify and resolve issues affecting product realization
Identify and establish the product's inspection requirements and techniques
Provide technical support to structure Customer data for manufacturability
Ensure all Customer and regulatory requirements are identified and satisfied throughout the product lifecycle
Sets quality assurance testing models for analysis of raw materials, materials in process, and finished products.
IPC-A-610 Certified Trainer preferred
J-STD-001 Certified Trainer preferred
Performs a variety of complex and critical tasks.
May provide consultation on complex projects and is considered to be the top-level contributor/specialist.
Understands and applies statistical analysis to promote continuous process and product improvement.
A wide degree of creativity and latitude is expected.
Understands and applies product and process knowledge to review, approve, validate and verify customer and engineering change, deviation and waiver activities.
Ability to assimilate facts and make recommendations to correct processing errors.
May report to an executive or a manager.
Skills:
Candidates must be able to work in a self-directed manner in a fast-paced working environment in or with new or loosely defined situations
Foster and maintain professional and productive relationships with other staff and management
Ability to maintain productive relationships while prioritizing tasks in a demanding environment
Possess excellent verbal and written communication skills
Possess strong proficiency in Microsoft Office Suite programs and Adobe Acrobat
Qualifications:
Requires a bachelor's degree in area of specialty and at least 8-10 years of experience in the field or in a related area.
Certified in statistical analysis (Black Belt preferred)
Demonstrates expertise in a variety of the field's concepts, practices, and procedures, including: ZDP, DFX, FMEA, CAPA, SPC, Control Charts, 5-Whys, MSA, RCA, 8-Disciplines Techniques of Problem Solving, Good Documentation Practice.
Ability to assess and implement statistical process control (SPC) data measurement systems at various processing steps
Relies on extensive experience and judgment to plan and accomplish goals.
Ability to conceptualize, design, implement, manage, and sustain critical processes
Assist in processing of returned products for warranty and service repair
Processing of Non-conforming products including Material Review processes
Defect analysis including root cause and corrective action
Corrective, preventive and improvement action processing
Experience in audit activities pertaining to standards, procedures and practices.
Practical working familiarity with industry and regulatory standards including OSHA, ISO and Medical
Knowledge of Engineering processes and terms
Physical Demands:
While performing these duties, the employee is required to sit, use hands, reach with hand and arm, and talk or hear. May be required to lift up to 20 pounds.
Zentech Manufacturing is an Equal Opportunity Employer. Zentech will recruit, hire, train, and promote persons in all job titles without regard to real or perceived classification of race, ethnicity, ancestry, color, marital status, religion, national origin, veteran status, sex, sexual orientation, genetic information, gender identity or expression, age, or physical or mental disability.
SQL Database Reliability Engineer
Reliability engineer job in Plano, TX
Description
The Database Reliability Engineer (DBRE) - SQL is a critical technical leader responsible for ensuring the availability, performance, and reliability of Tyler Technologies' SQL Server infrastructure. This role combines deep database expertise with operational excellence to support high-performance systems across complex client environments. The ideal candidate is confident, authoritative, and takes ownership in diagnosing and resolving issues, while influencing database design, infrastructure improvements, and long-term stability strategies.
Hybrid Work Policy: The candidate is required to be on-site at least 3x per week at the Plano, TX office.
Responsibilities
Serve as the lead resource for high-severity SQL Server incidents, driving triage, diagnostics, and resolution in real time.
Own performance tuning, indexing strategies, and architecture-level improvements to optimize database systems at scale.
Proactively monitor database performance, system health, and workload trends to identify and resolve issues before they impact customers.
Collaborate with product and development teams to refine schema design, improve query performance, and enhance overall data architecture.
Develop and maintain database standards, observability dashboards, and automation for patching, backups, and alerting.
Design and execute comprehensive backup and disaster recovery strategies for critical systems.
Contribute to continuous improvement initiatives, including cloud modernization, infrastructure as code, and capacity planning.
Author technical documentation, including runbooks, architecture designs, internal KBs, and lessons learned.
Provide mentorship and technical leadership to engineers across support and infrastructure teams.
Advocate for architectural and operational improvements across teams, using data and insight to influence decisions.
Role Complexity
To be successful, this individual must:
Demonstrate expert-level understanding of SQL Server internals, optimization techniques, and operational best practices.
Lead high-stakes conversations and incident calls with clarity, confidence, and control.
Understand system-level performance (I/O, memory, CPU) and how it affects SQL operations.
Communicate technical issues and solutions to both technical and business stakeholders effectively.
Analyze recurring incidents to identify trends and permanently resolve root causes.
Operate autonomously with a sense of ownership and urgency.
Balance short-term firefighting with long-term architectural planning and automation.
Qualifications
Bachelor's degree in Computer Science, Information Systems, or a related field, or 5+ years of equivalent experience.
Proven experience in a production-facing SQL Server environment, preferably in a SaaS or multi-tenant context.
Hands-on expertise in:
Query tuning, indexing, and execution plan analysis.
High availability, replication, disaster recovery, and backup strategies.
Scripting and automation using PowerShell, T-SQL, or similar.
Monitoring and observability tools for SQL Server and infrastructure health.
Strong familiarity with virtualization, storage performance, and cloud platforms (e.g., AWS, Azure).
Demonstrated ability to lead incident response and influence cross-functional technical decisions.
Exceptional written and verbal communication skills, especially under pressure.
Previous experience mentoring peers or junior engineers is a plus.
Site Reliability Engineer
Reliability engineer job in Richardson, TX
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
Meet the Team
Splunk, a Cisco company, is building a safer, more resilient digital world with an end-to-end, full stack platform designed for hybrid, multi-cloud environments. Join the FedRAMP team, where you will be a part of one of the largest and most sophisticated cloud-scale, big data, and microservices platforms in the world. You will be working with engineers who operate highly available, scalable, and cost-efficient applications with low operational burden by handling and improving the reliability and resiliency of services and infrastructure. You thrive on driving initiatives in automation, infrastructure-as-code, and reliability engineering, and on getting rid of tedious, manual tasks.
Your Impact
Splunk's FedRAMP SRE team is looking for a Site Reliability Engineer to help lead, design and build the next generation of our large scale cloud offering. You will be working on core services and applications that form the primitives for our current and future cloud service offerings. Site Reliability Engineers in this role will be engaging with multiple service owners across the platform to teach and implement modern interpretations of SRE, observability, Chaos Engineering and DevOps. This role is highly visible and impactful to the organization and will help shape Splunk's Engineering culture for years to come. Your job, in a nutshell, is to make every team around you better... including your own!
You will:
* Own Splunk Cloud in FedRAMP environments.
* Work across the organization to deliver quality products that delight Splunk's passionate users.
* Work with teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
Minimum Qualifications
* You have experience or an interest in working with regulated computing environments such as FISMA and/or FedRAMP and are enthusiastic about doing it better.
* You have worked with Kubernetes, EKS, GKE or AKS and the associated ecosystems. Kubernetes certifications or an interest in obtaining these certifications are a plus, such as those from the Cloud Native Computing Foundation; Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Certified Kubernetes Security Specialist (CKS).
* You have a good understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.).
* Experience with at least one programming language, preferably Go (Golang) or Python. Knowledge of working with and automating Linux systems tasks using this language is required, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
* Skilled in identifying performance bottlenecks, identifying anomalous system behavior, and resolving root cause of service issues.
Preferred Qualifications
* Experience monitoring cloud environments with Splunk.
* Experience with development and deployment in a hosted cloud environment, preferably AWS, Azure or GCP. Cloud certifications, or an interest in obtaining them, are a plus, such as AWS Certified Solutions Architect, AWS Certified DevOps Engineer, or Google Associate Cloud Engineer (ACE).
* Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
* Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes.
Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $126,500.00 to $182,000.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
* 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
* 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
* Non-exempt employees receive 16 days of paid vacation time per full calendar year, accrued at rate of 4.92 hours per pay period for full-time employees
* Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
* 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
* Additional paid time away may be requested to deal with critical or emergency issues for family members
* Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:
* 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
* 1.5% of incentive target for each 1% of attainment between 50% and 75%;
* 1% of incentive target for each 1% of attainment between 75% and 100%; and
* Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.
The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area:
$152,500.00 - $252,000.00
Non-Metro New York state & Washington state:
$135,800.00 - $224,400.00
* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.
3510- Site Reliability Engineer II
Reliability engineer job in Dallas, TX
About the Role
We at Innovaccer are looking for a Site Reliability Engineer-II to build secured modern healthcare cloud infrastructure and a massive data stack, and aim to write everything as code.
A Day in the Life
* Take ownership of SRE pillars: Deployment, Reliability, Scalability, Service Availability (SLA/SLO/SLI), Performance, and Cost.
* Lead production rollouts of new releases and emergency patches using CI/CD pipelines while continuously improving deployment processes.
* Establish robust production promotion and change management processes with quality gates across Dev/QA teams.
* Roll out a complete observability stack across systems to proactively detect and resolve outages or degradations.
* Analyze production system metrics, optimize system utilization, and drive cost efficiency.
* Manage autoscaling of the platform during peak usage scenarios.
* Perform triage and RCA by leveraging observability toolchains across the platform architecture.
* Reduce escalations to higher-level teams through proactive reliability improvements.
* Participate in the 24x7 OnCall Production Support team.
* Lead monthly operational reviews with executives covering KPIs such as uptime, RCA, CAP (Corrective Action Plan), PAP (Preventive Action Plan), and security/audit reports.
* Operate and manage production and staging cloud platforms, ensuring uptime and SLA adherence.
* Collaborate with Dev, QA, DevOps, and Customer Success teams to drive RCA and product improvements.
* Implement security guidelines (e.g., DDoS protection, vulnerability management, patch management, security agents).
* Manage least-privilege RBAC for production services and toolchains.
* Build and execute Disaster Recovery plans and actively participate in Incident Response.
* Work with a cool head under pressure and avoid shortcuts during production issues.
* Collaborate effectively across teams with excellent verbal and written communication skills.
* Build strong relationships and drive results without direct reporting lines.
* Take ownership, be highly organized, self-motivated, and accountable for high-quality delivery.
What You Need
* 4-7 years in production engineering, site reliability, or related roles.
* Solid hands-on experience with at least one cloud provider (AWS, Azure, GCP) with automation focus (certifications preferred).
* Strong expertise in Kubernetes and Linux.
* Proficiency in scripting/programming (Python required).
* Observability is critical at the scale of our systems, as is the ability to find insights and behavior and to detect problems and failures. We are looking for leads to drive this charter across logs, metrics, mesh, tracing, etc.
* Knowledge of CI/CD pipelines and toolchains (Jenkins, ArgoCD, GitOps).
* Familiarity with persistence stores (Postgres, MongoDB), data warehousing (Snowflake, Databricks), and messaging (Kafka).
* Exposure to monitoring/observability tools such as ElasticSearch, Prometheus, Jaeger, NewRelic, etc.
* Proven experience in production reliability, scalability, and performance systems.
* Experience in 24x7 production environments with process focus.
* Familiarity with ticketing and incident management systems.
* Security-first mindset with knowledge of vulnerability management and compliance.
* Advantageous: hands-on experience with Kafka, Postgres, and Snowflake.
* Excellent judgment, analytical thinking, and problem-solving skills.
* Ability to quickly identify and drive optimal solutions within constraints.
* Lead least-privilege RBAC for various production services and toolchains.
* Able to perform with a cool head in high-pressure situations without taking any shortcuts.
* Collaboration with solid verbal and written communication skills is critical to this role. Strong cross-functional collaboration skills, relationship-building skills, and the ability to achieve results without direct reporting relationships.
* Ability to quickly identify and drive to the optimal solution when presented with a series of constraints.
* Self-motivated individual who possesses excellent time management and organizational skills.
* Strong sense of personal responsibility and accountability for delivering high quality work.
We offer competitive benefits to set you up for success in and outside of work.
Here's What We Offer
* Generous Paid Time Off: Recharge and relax with 22 days of fixed time off per year, in addition to company holidays, because we believe work-life balance fuels performance.
* Best-in-Class Parental Leave: Spend quality time with your growing family. We offer one of the industry's most generous parental leave policies to support you during life's most important moments.
* Recognition & Rewards: We celebrate wins, big and small. Get rewarded with monetary incentives and company-wide recognition for your impact and dedication. Your hard work won't go unnoticed.
* Comprehensive Insurance Coverage: Stay covered with medical, dental, and vision insurance, plus 100% company-paid short- and long-term disability and basic life insurance. Optional perks include discounted legal aid and pet insurance.
Innovaccer Inc. is an equal opportunity employer. We celebrate diversity and are committed to fostering an inclusive workplace where all employees feel valued and empowered regardless of any characteristic protected by federal, state or local law including, without limitation, race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, medical condition, disability, age, marital status, or veteran status. Innovaccer Inc. participates in the E-Verify program to confirm employment eligibility of all newly hired employees based out of the U.S. and employed by Innovaccer Inc.
For any additional information, please visit the below websites:
E-Verify
Right to Work (English)
Right to Work (Spanish)
Disclaimer:
Innovaccer does not charge fees or require payment from individuals or agencies for securing employment with us. We do not guarantee job spots or engage in any financial transactions related to employment. If you encounter any posts or requests asking for payment or personal information, we strongly advise you to report them immediately to our HR department at *****************. Additionally, please exercise caution and verify the authenticity of any requests before disclosing personal and confidential information, including bank account details.
Principal System Safety Engineer (Onsite)
Reliability engineer job in Richardson, TX
Country:
United States of America Onsite
U.S. Citizen, U.S. Person, or Immigration Status Requirements:
The ability to obtain and maintain a U.S. government issued security clearance is required. U.S. citizenship is required, as only U.S. citizens are eligible for a security clearance
Security Clearance:
DoD Clearance: Secret
Our industry-leading experts are setting the standards for the aerospace industry and paving the way for the future. But as new challenges present themselves, we need fresh, creative and motivated minds to overcome these hurdles, help us break barriers and achieve new levels of innovation. Do you have what it takes to join a global organization that doesn't shy away from big opportunities? If so, we invite you to join our ranks and create the next generation of aerospace technologies.
Together, we will nurture an engineering culture that values intellectual curiosity, risk takers and integrity. A place where we will challenge ourselves, our teams, and the status quo and where we will work to find a way - the right way - to achieve what others can only dream of.
We are currently searching for a Principal Systems Engineer with a focus on System Safety to join our team in Richardson, TX. We are seeking individuals with the technical skills and personal leadership attributes to help support program development through the application of system safety techniques. The successful candidate will participate in all phases of the program life cycle, utilizing system safety methodologies as they apply in our systems engineering processes. They will be responsible for schedule tracking, through the application of an earned value management system, for programs currently in development, and for leading a small system safety team.
What YOU Will Do:
Support design and development teams to develop, assess, and verify system safety related requirements for mission-critical systems
Identify, evaluate, and mitigate hazards and risks associated with mission critical systems
Develop Functional Hazard Assessment (FHA), Preliminary System Safety Assessment (PSSA), System Safety Assessment (SSA), System Safety Hazard Analysis (SSHA), Hazardous Materials Report, Fault Tree Analysis
Apply MIL-STD-882E (DoD System Safety Standard Practice) and SAE ARP 4761 (Civil Aviation Safety Assessment Guidelines / Methods)
Serve as a subject matter expert to various disciplines from engineering, project engineers, and business leadership in the area of system safety
Develop proposal language and determine the appropriate BOE (basis of estimate) for bids and proposals during the pursuit and capture phase of a program for system safety programs
What YOU Will Learn:
Learn, discover and expand your knowledge about aerospace navigation systems and other various engineering disciplines
Learn to thrive and be supported in a dynamic team-driven environment
Learn to become a champion within our CORE Operating Systems
Qualifications You Must Have:
Typically requires a degree in Science, Technology, Engineering or Mathematics (STEM) and minimum 8 years prior relevant experience or an Advanced Degree in a related field and minimum 5 years of experience
Previous experience as a systems/hardware/software engineer working primarily with Systems Safety principles and best practices.
Previous experience in developing system safety analysis reports.
Qualifications We Prefer:
Knowledge of systems architectures and designs
Active and existing Secret Clearance
Understanding of military aircraft-related missions, functional operation, performance specifications, and verification and validation techniques
Exposure to Control Account Management (CAM) within the Earned Value Management System (EVMS) framework.
Familiarity with agile development and model-based systems engineering (MBSE).
Experience with Cameo, DOORS, and/or Atlassian (ALM) tools (e.g., Jira, Confluence)
Demonstrated oral and written communication skills, with the ability to effectively lead, interact and build trust with team members from many different backgrounds, including engineering management, engineering teams, program management, business development, support groups, and subcontractors.
What We Offer:
Medical, dental, and vision insurance
Three weeks of vacation for newly hired employees
Generous 401(k) plan that includes employer matching funds and separate employer retirement contribution, including a Lifetime Income Strategy option
Tuition reimbursement program
Student Loan Repayment Program
Life insurance and disability coverage
Optional coverages you can buy: pet insurance, home and auto insurance, additional life and accident insurance, critical illness insurance, group legal, ID theft protection
Birth, adoption, parental leave benefits
Ovia Health, fertility, and family planning
Adoption Assistance
Autism Benefit
Employee Assistance Plan, including up to 10 free counseling sessions
Healthy You Incentives, wellness rewards program
Doctor on Demand, virtual doctor visits
Bright Horizons, child and elder care services
Teladoc Medical Experts, second opinion program
May be eligible for relocation
And more!
Learn More & Apply Now!
Mission Systems:
Do you want to be a part of something bigger? A team whose impact stretches across the world, and even beyond? At Collins Aerospace, our Mission Systems team helps civilian, military and government customers complete their most complex missions - whatever and wherever they may be. Our customers depend on us for intelligent and secure communications, missionized systems for specialized aircraft and spacecraft and collaborative space solutions. By joining our team, you'll have your own critical part to play in ensuring our customer succeeds today while anticipating their needs for tomorrow. Are you up for the challenge? Join our mission today.
* Please consider the following role type definitions as you apply for this role.
Onsite: Employees who are working in Onsite roles will work primarily onsite. This includes all production and maintenance employees, as they are essential to the development of our products.
At Collins, the paths we pave together lead to limitless possibility. And the bonds we form - with our customers and with each other - propel us all higher, again and again.
Apply now and be part of the team that's redefining aerospace, every day.
The salary range for this role is 101,000 USD - 203,000 USD. The salary range provided is a good faith estimate representative of all experience levels. RTX considers several factors when extending an offer, including but not limited to, the role, function and associated responsibilities, a candidate's work experience, location, education/training, and key skills. Hired applicants may be eligible for benefits, including but not limited to, medical, dental, vision, life insurance, short-term disability, long-term disability, 401(k) match, flexible spending accounts, flexible work schedules, employee assistance program, Employee Scholar Program, parental leave, paid time off, and holidays. Specific benefits are dependent upon the specific business unit as well as whether or not the position is covered by a collective-bargaining agreement. Hired applicants may be eligible for annual short-term and/or long-term incentive compensation programs depending on the level of the position and whether or not it is covered by a collective-bargaining agreement. Payments under these annual programs are not guaranteed and are dependent upon a variety of factors including, but not limited to, individual performance, business unit performance, and/or the company's performance. This role is a U.S.-based role. If the successful candidate resides in a U.S. territory, the appropriate pay structure and benefits will apply. RTX anticipates the application window closing approximately 40 days from the date the notice was posted. However, factors such as candidate flow and business necessity may require RTX to shorten or extend the application window.
RTX is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or veteran status, or any other applicable state or federal protected class. RTX provides affirmative action in employment for qualified Individuals with a Disability and Protected Veterans in compliance with Section 503 of the Rehabilitation Act and the Vietnam Era Veterans' Readjustment Assistance Act.
Site Reliability Engineer (SRE)
Reliability engineer job in Southlake, TX
Job Description
Client is actively taking its operational best practices to an even higher level while adopting and transforming to a fully agile model in its software development practices. The ODX (Operational Data Exchange) organization, part of the Client Data team, is responsible for delivering multiple Data Exchanges and making sure they continue to follow these best practices to achieve operational excellence. We are seeking an entry-level to junior software engineer to fill the site reliability engineering (SRE) role. The candidate is expected to have a good understanding of data engineering best practices, such as creating and securing data pipelines and data stores, while documenting and following best practices to meet Client's policies and procedures. The primary responsibility of this position is to become a subject matter expert for the Human Resource Data Exchange platforms operations role and to ensure Client's standards are met. The candidate will be expected to improve relevant SRE processes while identifying automation opportunities for monitoring scheduled processes, ETLs, and reports. This will result in effective, reliable, and repeatable solutions that continue to improve Client's operational excellence for multiple Data Exchange platforms.
Responsibilities:
In-depth knowledge of Workday functionalities, configurations, and alignment to Client's best practices.
Validate data checks while providing input to improve data quality checks that match Client's best practices and governance policies.
Monitor job health and data quality metrics while identifying areas for improvement.
Optimize data workflows and processes while troubleshooting and resolving any data-related issues identified.
Work to develop and maintain a reporting dashboard that reflects the health of each Data Exchange concerning daily reports matching SLA and SLO targets.
Analyze report trends to create insights for improvements that enhance the efficiency of data operations.
Collaborate with cross-functional Data Exchange platform support teams to ensure best practices are in place concerning new releases and patching for each platform.
Requirements:
Bachelor's degree in information technology or a related field, or equivalent experience.
Experience in Workday implementations, configurations, and support; Workday certifications are a plus.
Good understanding of Database, SQL and/or Postgres
Good understanding of Data pipelines and patterns
Experience in Python, Java, JavaScript coding
Experience with ETL tools like Informatica
Experience with Workday development
Experience with incident and change management tracking tools such as Remedy
Good oral and written communication skills to convey ideas and persuade others
Experience with dev tools such as Jira, GitHub, Bamboo, JMeter, and SonarQube is a plus
Staff Site Reliability Engineer
Reliability engineer job in Plano, TX
EchoStar is reimagining the future of connectivity. Our business reach spans satellite television service, live-streaming and on-demand programming, smart home installation services, mobile plans and products. Today, our brands include Boost Mobile, DISH TV, Gen Mobile, Hughes and Sling TV.
Department Summary
Our Technology teams challenge the status quo and reimagine capabilities across industries. Whether through research and development, technology innovation or solution engineering, our team members play a vital role in connecting consumers with the products and platforms of tomorrow.
Job Duties and Responsibilities
Key Responsibilities:
* Implement and maintain monitoring, alerting, observability, and distributed tracing solutions to improve system health visibility, MTTD, and proactive issue detection
* Lead incident response efforts, participate in on-call rotations, conduct root cause analysis, and facilitate blameless post-mortems to drive continuous improvement and reduce MTTR
* Develop, document, and maintain incident response procedures, runbooks, and SOPs to ensure rapid, consistent responses to critical incidents
* Define, measure, and report SLIs/SLOs for retail wireless and supporting applications, implementing SRE best practices like error budgets and chaos engineering to enhance system reliability
* Drive automation initiatives by developing tools and solutions to streamline operational tasks, reduce manual effort, and eliminate toil
* Collaborate across teams to embed reliability into the software development lifecycle, ensure production readiness of new features, optimize performance, plan capacity, and mentor junior engineers
Skills, Experience and Requirements
Education and Experience:
* Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
* 6+ years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role in a fast-paced, high-availability environment. Experience in the telecommunications or retail industry is a strong plus
Skills and Qualifications:
* Strong experience with monitoring and observability tools, preferably Dynatrace, along with Prometheus, Grafana, Splunk, ELK Stack, Datadog, or AppDynamics
* Proficient in programming and scripting languages relevant to SRE work, such as Python, Go, Java, Ruby, or Bash, with strong SQL skills for relational and NoSQL databases like Oracle, Cassandra, PostgreSQL, and MySQL
* Extensive experience with cloud platforms, preferably AWS, and familiarity with services from Azure or Google Cloud; certifications like AWS Solutions Architect Associate are a plus
* Solid understanding of microservices, REST APIs, and containerization technologies such as Docker; Kubernetes experience is preferred
* Familiar with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, CircleCI), and experience with retail wireless systems such as billing and activation platforms
* Strong problem-solving, analytical, and debugging skills, with excellent communication and collaboration abilities; adaptable to dynamic environments and willing to support on-call rotations and weekend coverage
Candidates must be willing to participate in at least one in-person interview, which may include a live whiteboarding or technical assessment session.
Salary Ranges
Compensation: $110,100.00/Year - $157,300.00/Year
Benefits
We offer versatile health perks, including flexible spending accounts, HSA, a 401(k) Plan with company match, ESPP, career opportunities, and a flexible time away plan; all benefits can be viewed here: DISH Benefits.
The base pay range shown is a guideline. Individual total compensation will vary based on factors such as qualifications, skill level, and competencies; compensation is based on the role's location and is subject to change based on work location.
Candidates need to successfully complete a pre-employment screen, which may include a drug test and DMV check. Our company is committed to fostering an inclusive and equitable workplace where every individual has the opportunity to succeed. We are dedicated to providing individuals with criminal or arrest records a fair chance of employment in accordance with local, state, and federal laws.
The posting will be active for a minimum of 3 days. The active posting will continue to extend by 3 days until the position is filled.
We pride ourselves on developing and promoting talent as an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. EchoStar will accommodate the sincerely held religious beliefs of employees if such accommodations are not undue hardships and are otherwise within the bounds of applicable law. All qualified applicants with arrest or conviction records will be considered for employment in accordance with local, state, and federal law. You may redact any information that identifies age, date of birth, or dates of school/graduation from your application documents before submission and throughout our application process.
EchoStar will provide reasonable accommodation to otherwise qualified job applicants and employees with known physical or mental disabilities, unless doing so poses an undue hardship on the Company, poses a direct threat of substantial harm to others, or is otherwise not required by law. EchoStar has a more detailed Accommodation Policy that applies to employees. EchoStar endeavors to make echostar.com and jobs.echostar.com accessible to users. Please contact *************** if you would like to discuss the accessibility of our website or need assistance completing the application process. This contact information is for accommodation requests only; do not use this contact information to inquire about the status of applications.
Site Reliability Engineering (SRE)
Reliability engineer job in Dallas, TX
Akkodis is seeking a Site Reliability Engineer (SRE) for a contract position with a client in Dallas, TX. This role involves troubleshooting e-commerce application issues and monitoring system performance to ensure optimal availability and functionality. Rate Range: $41/hour to $56/hour. The rate may be negotiable based on experience, education, geographic location, and other factors.
Site Reliability Engineering (SRE) job responsibilities include:
* Respond to production incidents and alerts within defined SLA timeframes, ensuring timely resolution of critical issues.
* Troubleshoot e-commerce application problems, including order processing failures, payment gateway issues, and cache-related performance concerns.
* Monitor WCS application health using tools like AppDynamics, Dynatrace, and Nagios, and perform proactive health checks.
* Analyze application logs to identify patterns, resolve recurring issues, and improve system stability.
* Manage scheduled jobs, batch processes, and feed file workflows to ensure smooth operations.
* Execute service restart and recovery procedures for WCS components and maintain compliance with ITIL processes for incident, problem, and change management.
Required Qualifications:
* Bachelor's degree in Computer Science, Information Technology, or a related field.
* 3-5 years of experience in application or technical support roles, with at least 2 years of hands-on experience in WebSphere Commerce Suite (WCS) 8.x or similar e-commerce platforms.
* Strong troubleshooting skills for e-commerce application issues, including order processing, payment gateway failures, and cache optimization.
* Experience with monitoring tools (AppDynamics, Dynatrace), database technologies (DB2, Oracle, SQL Server), and basic Linux/Unix command-line operations.
If you are interested in this role, then please click APPLY NOW. For other opportunities available at Akkodis, or any questions, feel free to contact me at *****************************.
Pay Details: $41.00 to $56.00 per hour
Benefit offerings available for our associates include medical, dental, vision, life insurance, short-term disability, additional voluntary benefits, EAP program, commuter benefits and a 401K plan. Our benefit offerings provide employees the flexibility to choose the type of coverage that meets their individual needs. In addition, our associates may be eligible for paid leave including Paid Sick Leave or any other paid leave required by Federal, State, or local law, as well as Holiday pay where applicable.
Equal Opportunity Employer/Veterans/Disabled
Military connected talent encouraged to apply
To read our Candidate Privacy Information Statement, which explains how we will use your information, please navigate to *************************************************
The Company will consider qualified applicants with arrest and conviction records in accordance with federal, state, and local laws and/or security clearance requirements, including, as applicable:
* The California Fair Chance Act
* Los Angeles City Fair Chance Ordinance
* Los Angeles County Fair Chance Ordinance for Employers
* San Francisco Fair Chance Ordinance
Massachusetts Candidates Only: It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Cloud Service Reliability Engineer
Reliability engineer job in Plano, TX
Job Description
We are looking for someone who is a generalist at heart: curious, appreciative of complexity, and someone who knows, or wants to learn, when to step back and when to dive deep. We call this role a Cloud Service Reliability Engineer.
The Cloud Service Reliability Engineer will be responsible for effective design, execution, and maintenance of systems implemented on premise or in the cloud, primarily focused on identity and access management, cloud computing services/integrations, and data analytics technologies.
Responsibilities of the Cloud Service Reliability Engineer:
Establishes technology product specifications and collaborates with various functions to ensure successful product development and implementation.
Deploys, supports and monitors new and existing services, platforms, and application stacks.
Drives improvements to processes and design enhancements for automation to continuously improve the production environment.
Skills and Experience
Minimum of 5 years of experience in automating infrastructure, service delivery, and site reliability engineering, maintaining infrastructure on premises and in cloud environments
The specific automation product does not matter (any of Puppet, Chef, Fabric, Ansible, or Salt is fine)
Proven experience with cloud platforms (Azure, AWS, GCP), PaaS, and SaaS
DevOps tools (GitHub, GitHub Actions, Jenkins, Jira, and other CI/CD tools)
Configuration management and infrastructure as code (Terraform, Ansible) experience
Expertise in Active Directory Domain Services, Active Directory Federation Services (ADFS), and Active Directory Certificate Services
Site Reliability Engineer (SRE) - SQL & NoSQL Database Operations
Reliability engineer job in Plano, TX
Job Description
This role bridges software development and IT operations, focusing specifically on ensuring the reliability, scalability, and performance of our SQL and NoSQL database infrastructure. The SRE applies software engineering principles to automate database operations, build robust systems, and proactively address issues to minimize downtime and optimize resource utilization.
Responsibilities
Design and Implementation: Design, implement, and maintain scalable and reliable SQL and NoSQL database systems to support high-performance applications.
Automation and Optimization: Develop automation using scripting languages (e.g., Python) and configuration management tools (e.g., Ansible, Terraform) to streamline database operations, deployments, and infrastructure management.
Performance Tuning: Monitor, analyze, and optimize SQL queries and NoSQL database configurations (e.g., indexing, sharding, replication) to improve database performance and scalability.
Monitoring and Alerting: Design, build, and maintain comprehensive database monitoring solutions to track key metrics (e.g., availability, latency, errors, saturation) and establish effective alerts.
Incident Response: Participate in on-call rotations to respond to and resolve complex production issues and database-related incidents, conduct root cause analyses, and implement preventative measures.
Infrastructure Modernization: Drive the modernization of existing database infrastructure through migrations, upgrades, and optimization efforts.
Collaboration: Work closely with software developers, data engineers, DevOps teams, and architects to integrate database solutions into applications and optimize database performance and stability.
Capacity Planning: Plan and manage database infrastructure capacity to support increasing data volumes and user traffic.
Security: Implement database security policies, access controls, and encryption techniques to protect sensitive data.
Backup and Recovery: Design and implement robust backup and recovery strategies to ensure data integrity and minimize downtime in case of failures.
Skills & Requirements
Database Expertise: Deep understanding and hands-on experience with both SQL (e.g., SQL Server, MySQL) and NoSQL databases (e.g., MongoDB, Redis).
Software Engineering: Proficiency in programming languages and frameworks such as Python, C#, ReactJS, and NodeJS, and experience with software development best practices.
Automation and Scripting: Expertise in automation tools (e.g., Ansible, Terraform) and scripting languages (e.g., Python, Bash) for database operations and infrastructure management.
Cloud Platforms: Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization technologies (e.g., Docker, Kubernetes).
System Reliability Principles: Strong understanding of SRE principles and practices (e.g., SLIs, SLOs, error budgets, incident management).
Problem-Solving: Excellent diagnostic and problem-solving skills with the ability to analyze complex systems and troubleshoot issues under pressure.
Communication and Collaboration: Strong communication and collaboration skills to work effectively with cross-functional teams and stakeholders.
Data Modeling and Design: Proficiency in data modeling and schema design for both relational and NoSQL databases.
Performance Tuning: Ability to identify and resolve performance bottlenecks, optimize queries, and tune database configurations.
This SRE role requires a blend of technical depth, a proactive mindset, and a commitment to continuous improvement to ensure the smooth operation and optimal performance of critical database systems.
Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.
Relevant certifications in database technologies or cloud platforms (a plus).
Site Reliability Engineer
Reliability engineer job in Dallas, TX
Site Reliability Engineer, Interconnection Service and Network Delivery
Your role
In this role, you will be responsible for deploying and maintaining all Digital Realty interconnection fabric network infrastructure. The ideal candidate can demonstrate a unique blend of network engineering, network operations, and software understanding through the application of engineering principles. You will focus on delivering operational discipline and embrace key operational principles including automation, agile development, and scripting.
What you'll do
You will be part of the global Fabric Engineering organization and work in tandem with other teams to build and maintain a global network infrastructure. Ideal candidates for this role will bring an understanding of carrier-class network infrastructure as well as experience working in a fast-paced development environment.
What you'll need
5+ years of operations and engineering experience
Bachelor's degree in Computer Science (or equivalent) preferred
Strong experience with automation tools (Ansible, Terraform, etc)
Strong experience working with Linux systems and tools
Experience with Python (or equivalent high-level language)
Experience with software development tools (GitHub, Jenkins, etc.) and software development practices
Experience with any of the Hyperscale cloud platforms such as Azure, AWS, Google Cloud, Oracle Cloud, IBM Cloud is an advantage.
Experience configuring monitoring, responding to alerts, and addressing the underlying issues to eliminate toil
Must participate in a 24x7x365 on-call rotation
Familiarity with Layer 3 routing protocols (BGP, IS-IS, OSPF), MPLS technologies, and Layer 2 switching protocols (802.1Q VLAN tagging, Spanning Tree Protocol).
Familiarity with virtual networking concepts such as L2VPN, EVPN, Open vSwitch, and VXLAN preferred
Ability to understand high-level network design and its impacts across the infrastructure
Ability to work independently on complex and unique service provider engineering projects
Strong analytical and troubleshooting skills
Strong communication skills
A bit about us
Digital Realty brings companies and data together by delivering the full spectrum of data center, colocation and interconnection solutions. PlatformDIGITAL, the company's global data center platform, provides customers with a secure data meeting place and a proven Pervasive Datacenter Architecture (PDx) solution methodology for powering innovation and efficiently managing Data Gravity challenges. Digital Realty gives its customers access to the connected data communities that matter to them with a global data center footprint of 300+ facilities in 50+ metros across 25+ countries on six continents.
To learn more about Digital Realty, please visit digitalrealty.com or follow us on LinkedIn and Twitter.
Our IT team is at the heart of our business. We develop infrastructures, design and build networks, support servers and provide the first line of support by delivering rich connectivity for our customers. With new data centers coming online all the time, it's a rapidly changing technical environment so our team is always ready to innovate and take the lead on projects. We constantly develop, deploy and support vital networks and data services that drive business performance and improve life for customers around the globe.
What we can offer you
Our rapidly evolving business sector offers the opportunity to be part of a courageous and passionate team who work together to understand and meet the changing needs of our global customers.
Join us and you'll be part of a supportive and inclusive environment where you can bring your whole self to work. As part of our team, you'll get to work with people from different business areas, challenge the way we do things and put your ideas into action. We'll also give you plenty of development opportunities so you can build a rewarding and successful career with us.
Apply today, take charge of your career and grow your talents with us.
Health and Safety
Safety isn't just a priority here at Digital; it's critical to everything we do. Safeguarding lives, protecting assets, and securing data aren't just ideals - they're essential pillars of our commitment to excellence for our people, our partners and our customers. We have a culture of care where every member of Team Digital embraces a relentless pursuit of working safely across Digital Realty. Together we are Safely Powering Progress.
Our Compensation Philosophy
Digital Realty offers its employees a highly competitive compensation package, excellent benefits, and an environment that recognizes and rewards your contributions. Central to our compensation philosophy is rewarding our employees for achieving the values and objectives aligned to the company's overall goals and values.
Blockchain Site Reliability Engineer
Reliability engineer job in Dallas, TX
Company: **********************
Contact: [email protected]
About Company
InfStones is an advanced, enterprise-grade Platform as a Service (PaaS) blockchain infrastructure provider trusted by the top blockchain companies in the world. InfStones' AI-based infrastructure provides developers worldwide with a rugged, powerful node management platform alongside an easy-to-use API. With over 20,000 nodes supported on over 80 blockchains, InfStones gives developers all the control they need - reliability, speed, efficiency, security, and scalability - for cross-chain DeFi, NFT, GameFi, and decentralized application development. InfStones is trusted by the biggest blockchain companies in the world including Binance, CoinList, BitGo, OKX, Chainlink, Polygon, Harmony, and KuCoin, among a hundred other customers. InfStones is dedicated to developing the next evolution of a better world through limitless Web3 innovation.
To date, InfStones has raised over $110 million in capital and is backed by Softbank, GGV Capital, Susquehanna International Group (SIG), Dragonfly Capital, Qiming Venture Partners, Plug and Play, and many renowned institutional investors. InfStones is proud to offer medical, vision, dental, short-term and long-term disability insurance, 401(k) plan with company matching, FSA, and other benefits to all full-time employees, along with flexible paid time off, sick days, and holidays.
If you enjoy being on the cutting edge of technology, we encourage you to apply!
Job Description
As a Blockchain Site Reliability Engineer (SRE), you will be responsible for ensuring the reliability, availability, and performance of blockchain nodes and related infrastructure. You'll monitor, troubleshoot, and resolve incidents in production environments, while also building automation tools to improve efficiency and reduce operational risks.
This role requires strong Linux system expertise, solid on-call and incident response experience, and the ability to work under pressure to quickly restore services. You'll also collaborate with protocol engineers and open-source communities to ensure smooth upgrades and long-term system stability.
Key Responsibilities
1. Deploy, monitor, and maintain blockchain nodes across multiple networks.
2. Ensure system reliability and uptime by actively managing incidents, troubleshooting, and resolving node failures.
3. Develop automation and maintenance tools (using Golang, Shell, Python, etc.) to streamline operations.
4. Build and maintain monitoring, alerting, and logging systems to proactively detect and address issues.
5. Collaborate with engineering teams and solution architects on reliability improvements and incident prevention.
6. Participate in the on-call rotation to provide timely incident response and resolution.
Qualifications
1. Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
2. Strong Linux system administration skills (networking, performance tuning, debugging, security).
3. Expertise in at least one mainstream programming language (such as Golang, Python, JavaScript, or Rust), with good programming skills and habits.
4. Experience with monitoring/alerting tools (e.g., Prometheus, Grafana, ELK, etc.).
5. Strong problem-solving skills and the ability to respond quickly under pressure.
6. Solid technical documentation skills.
Preferred (Nice to have)
1. Hands-on experience with blockchain node deployment, maintenance, and upgrades.
2. Familiarity with mainstream blockchain protocols (e.g., Ethereum, Cosmos, Polkadot, Solana).
3. Experience with containerization/orchestration tools (Docker, Kubernetes).
4. Knowledge of smart contracts, Web3 RPC, or Solidity is a plus.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Site Reliability Engineer III
Reliability engineer job in Plano, TX
There's nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.
As a Site Reliability Engineer III at JPMorgan Chase within the Consumer and Community Banking, you will serve as an experienced member of an agile team, focusing on designing and delivering trusted, market-leading technology products that are secure, stable, and scalable. You will be responsible for implementing critical technology solutions across multiple technical domains, supporting various business functions to achieve the firm's objectives.
Job responsibilities
Guides and assists others in the areas of building appropriate level designs and gaining consensus from peers where appropriate
Collaborates with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
Collaborates with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications
Implements infrastructure, configuration, and network as code for the applications and platforms in your remit
Collaborates with technical experts, key stakeholders, and team members to resolve complex problems
Understands service level indicators and utilizes service level objectives to proactively resolve issues before they impact customers
Supports the adoption of site reliability engineering best practices within your team
Required qualifications, capabilities, and skills
Formal training or certification on software engineering concepts and 3+ years applied experience
Proficient in site reliability culture and principles and familiarity with how to implement site reliability within an application or platform
Proficient in at least one programming language such as Python, Java/Spring Boot, or .NET
Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., Cloud, artificial intelligence, Android, etc.)
Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
Familiarity with container and container orchestration such as ECS, Kubernetes, and Docker
Familiarity with troubleshooting common networking technologies and issues
Preferred qualifications, capabilities, and skills
Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision
Ability to proactively recognize roadblocks and demonstrate interest in learning technology that facilitates innovation
Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team
Ability to initiate and implement ideas to solve business problems
Systems Safety Engineer
Reliability engineer job in Richardson, TX
Join the Specialty Engineering team to design, develop, and test product capabilities for a wide range of developmental equipment as well as existing solutions tailored for rugged environments. Our engineers support custom systems at the board, middleware, and complete system levels, working on hardware designs from simple microcontroller-based peripherals to complex secure and rugged processing systems. As a part of this multidisciplinary team, you will be involved in all program phases from bidding to delivery, requiring strong communication skills and a collaborative spirit.
Responsibilities
* Apply technical skills within the systems engineering framework and operate from a holistic view of programs.
* Develop, derive, understand, and assign System Safety requirements for all phases of every program.
* Write guides, develop analyses, and model to address requirements at component, subsystem, and system levels.
* Prepare and review documents consistent with US Military DIDs and participate in multidisciplinary design reviews.
* Perform hazard analysis IAW MIL-STD-882 and contractual requirements.
* Develop System Safety Program Plans (SSPP).
* Conduct analyses including PHA, SSHA, SHA, O&SHA, FHA, Software Safety Hazard Analysis, SAR, and Fault Tree Analysis.
* Review designs, specifications, tests, changes, reports, and deviation/waiver requests for potential safety impacts.
* Participate in incident investigations and root cause or corrective action analyses.
* Provide design direction for safety criteria for systems, equipment, and procedures to control and eliminate hazards.
Essential Skills
* Bachelor's degree in STEM with 10 years of experience or a Master's degree with 8 years of experience.
* DoD Secret clearance or higher with an investigation completed within 6 years.
* 5 years of System Safety Engineering experience.
* Deep familiarity with MIL-STD-882.
* Experience in DoD System Safety Working Groups.
* Comfortable in fast-paced and dynamic product development environments.
Additional Skills & Qualifications
* Experience with DoD/aerospace practices and programmatic experience with BOE, RFP, ROM.
* Ability to determine and execute the right evaluation method (quantitative or qualitative).
* Experience supporting independent government Safety Review Boards.
* Familiarity with DO-178C or other relevant standards.
* Wide breadth of experience in various aircraft systems.
* Ability to assess system-level impacts based on component-level data.
* Experience with system modeling tools like Cameo or similar MBSE tools.
Job Type & Location
This is a Contract to Hire position based out of Richardson, TX.
Pay and Benefits
The pay range for this position is $40.00 - $48.00/hr.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
* Medical, dental & vision
* Critical Illness, Accident, and Hospital
* 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
* Life Insurance (Voluntary Life & AD&D for the employee and dependents)
* Short and long-term disability
* Health Spending Account (HSA)
* Transportation benefits
* Employee Assistance Program
* Time Off/Leave (PTO, Vacation or Sick Leave)
Workplace Type
This is a hybrid position in Richardson, TX.
Application Deadline
This position is anticipated to close on Dec 24, 2025.
About Actalent
Actalent is a global leader in engineering and sciences services and talent solutions. We help visionary companies advance their engineering and science initiatives through access to specialized experts who drive scale, innovation and speed to market. With a network of almost 30,000 consultants and more than 4,500 clients across the U.S., Canada, Asia and Europe, Actalent serves many of the Fortune 500.
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.
If you would like to request a reasonable accommodation, such as the modification or adjustment of the job application process or interviewing due to a disability, please email actalentaccommodation@actalentservices.com for other accommodation options.
Manufacturing Process Engineer I
Reliability engineer job in Allen, TX
Job Description
Photronics is Hiring!
For more than 50 years, Photronics has been a global leader in photomask technology, powering the innovation behind smartphones, computers, automotive electronics, and the devices people rely on every day. Our success is built on precision, collaboration, and the problem-solving mindset of our people. Join us and help shape the future of semiconductor technology.
About the Role
The Manufacturing Process Engineer I is an entry-level engineering role that supports the development, optimization, and daily execution of photomask manufacturing processes. Working alongside senior engineers and cross-functional partners across Operations, Quality, IT, and Engineering, you will help ensure our processes run efficiently, consistently, and safely.
This is an excellent opportunity for a recent graduate or early-career engineer who is eager to learn, grow, and build technical expertise in a high-tech manufacturing environment.
What You'll Do
Assist in identifying opportunities to improve process efficiency, quality, and safety
Support troubleshooting of process and equipment issues alongside senior engineers
Collect, organize, and analyze production or test data to monitor process trends
Help maintain and update process documentation, including procedures and work instructions
Participate in basic process characterization, validation, and qualification activities
Assist with new equipment setups and initial process evaluations
Collaborate with operators, technicians, and engineers on daily manufacturing operations
Provide basic technical support and share knowledge with production teams
Perform additional duties as assigned
Travel: Less than 5%
What You Bring
Knowledge, Skills & Abilities
Foundational understanding of engineering or manufacturing principles
Strong attention to detail and eagerness to learn new tools and technologies
Effective written and verbal communication skills
Ability to work independently and collaboratively
Commitment to following safety and cleanroom protocols
Curiosity, analytical thinking, and a proactive approach to problem-solving
Ability to thrive in a fast-paced environment
Experience
0-2 years of experience in process engineering, manufacturing, or a related technical field
Internship or co-op experience strongly considered
Exposure to semiconductor or cleanroom environments is a plus
Familiarity with Lean, Six Sigma, or process improvement concepts is helpful but not required
Education
Bachelor's degree in Chemical Engineering, Materials Science, Mechanical Engineering, Physics, or a related technical field or equivalent work experience.
Compensation & Benefits
Competitive salary + bonus potential
Comprehensive health, dental, and vision insurance
401(k) with company match
Generous PTO and paid holidays
Career development and training opportunities
Collaborative, inclusive workplace culture
Equal Opportunity Statement: We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal opportunity workplace. We are committed to providing reasonable accommodation for team members' disabilities and religious beliefs or practices.
Agency Notice: Photronics does not accept unsolicited resumes or outreach from search firms or employment agencies. Please, no phone calls or emails to any employee regarding this opening. Resumes submitted outside of our approved agency engagement process will be considered the sole property of Photronics, and no fees will be paid if such candidates are hired. Only agencies with a valid agreement in place with Photronics and assigned to this role may submit candidates.
Senior Site Reliability Engineer - Integration Services
Reliability engineer job in Plano, TX
Who we are Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world's most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We're looking for talented team members who want to Dream. Do. Grow. with us.
An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company, delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create best-in-class customer experience in an innovative, collaborative environment.
To save time applying, Toyota does not offer sponsorship of job applicants for employment-based visas or any other work authorization for this position at this time.
Who we're looking for
Toyota Financial Services is hiring a Senior Site Reliability Engineer to support and scale our enterprise integration platforms. You'll focus on keeping our API-driven systems, like MuleSoft, Kafka, and Java microservices, resilient, observable, and automated.
This role is ideal for someone with a software engineering mindset who thrives in platform-focused environments and is passionate about reducing operational toil, improving reliability, and enabling velocity for development teams.
What you'll be doing
* Operate and scale integration and messaging platforms like MuleSoft, Kafka (MSK), Apache Camel, and TIBCO
* Build automation and self-healing capabilities using Python or equivalent scripting
* Define and manage SLIs/SLOs, health checks, and proactive remediation
* Establish observability with tools like Dynatrace, CloudWatch, and centralized logging
* Support enterprise middleware and COTS platforms (e.g., RightFax, Document Management Systems)
* Collaborate with architects and engineers to ensure solutions meet SRE and operational standards
* Participate in on-call rotations and lead incident response/postmortems
* Drive continuous improvement across deployment hygiene, monitoring, and platform resilience
What you bring
* 5+ years of experience in SRE, DevOps, or backend software engineering roles
* Proven experience with Java-based API development and integration tooling
* Hands-on with at least two of the following: MuleSoft, Kafka (MSK/Confluent), Apache Camel, TIBCO (BW/EMS)
* Strong scripting skills (Python preferred) to automate operations and reduce toil
* Experience with observability tools (Dynatrace highly preferred)
* Solid grasp of API security protocols (OAuth2, JWT, mTLS)
* Background in middleware, integration platforms, or event-driven systems
Added bonus if you have
* Experience working in hybrid or cloud-native environments (AWS preferred)
* Familiarity with CI/CD pipelines and infrastructure automation tools
* Exposure to integration design patterns (ESB, microservices, pub/sub)
What we'll bring
During your interview process, our team can fill you in on all the details of our industry-leading benefits and career development opportunities. A few highlights include:
* A work environment built on teamwork, flexibility, and respect
* Professional growth and development programs to help advance your career, as well as tuition reimbursement
* Team Member Vehicle Purchase Discount
* Toyota Team Member Lease Vehicle Program (if applicable)
* Comprehensive health care and wellness plans for your entire family
* Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota regardless of whether you contribute
* Paid holidays and paid time off
* Referral services related to prenatal services, adoption, childcare, schools and more
* Tax Advantaged Accounts (Health Savings Account, Health Care FSA, Dependent Care FSA)
* Relocation assistance (if applicable)
Belonging at Toyota
Our success begins and ends with our people. We embrace all perspectives and value unique human experiences. Respect for all is our North Star. Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members' efforts to dream, do and grow without questioning that they belong.
Applicants for our positions are considered without regard to race, ethnicity, national origin, sex, sexual orientation, gender identity or expression, age, disability, religion, military or veteran status, or any other characteristics protected by law.
Have a question, need assistance with your application or do you require any special accommodations? Please send an email to *****************************.
Drinking Water Process Engineer
Reliability engineer job in Dallas, TX
Kennedy Jenks is seeking a Drinking Water Process Engineer to provide technical expertise and support for drinking water treatment projects. The ideal candidate will have experience in drinking water treatment technologies and regulations, with a strong interest in continuing to develop their skills in water treatment. This role offers opportunities to work alongside experienced engineers and support clients on a variety of water quality and treatment projects.
Key Responsibilities:
Provide technical support for drinking water treatment projects, including treatment process evaluation, process selection, and operations optimization.
Assist in preliminary engineering studies and feasibility assessments for municipal drinking water treatment systems.
Support project teams with design and technical contributions, including developing process flow diagrams and design criteria for water treatment facilities.
Collaborate with client service managers by contributing technical insights during project meetings and presentations.
Participate in research and process improvements related to water quality and treatment technologies.
Provide input on water treatment facility performance evaluations and assist in operations optimization.
Stay engaged in water-focused professional organizations and present technical material at conferences.
Qualifications:
Bachelor's or Master's degree in civil / environmental engineering, or related scientific discipline required.
7+ years of experience in drinking water treatment engineering
Practical professional engineer (PE) license or ability to obtain licensure within one year of hire. License in one or multiple states (CA, CO, FL, HI, OR, TX, VA, WA) preferred.
Strong familiarity with drinking water treatment regulations and technologies.
Ability to work as part of a project team, supporting senior staff and contributing to technical deliverables.
Strong communication skills and ability to convey technical information clearly to colleagues and clients.
Kennedy Jenks supports a healthy work-life balance and utilizes a hybrid model of home and office work, with a minimum of two days per week in the office. This approach empowers our people to thrive, collaborate, and achieve their full potential.
The salary range for this position is anticipated to be $110,000 to $140,000, and may vary based upon education, experience, qualifications, licensure/certifications and geographic location.
This position is eligible for performance and incentive compensation.
Process Engineer
Reliability engineer job in Dallas, TX
MTU Maintenance Fort Worth is part of the world's largest independent jet engine MRO company based in Germany providing aftermarket and OEM-licensed engine maintenance services worldwide. As a part of MTU Aero Engines, with over 80 years of experience in the design, development, and production of jet engine components, modules, and engines, MTU Maintenance is a global network of over 4,000 employees with over 35 years of experience in the MRO market.
MTU provides maintenance from targeted hospital visits to complete overhauls on more than 30 commercial aero engine and industrial gas turbine lines and has completed more than 18,000 shop visits for over 1,400 customers worldwide. Within this global network, MTU Maintenance Dallas provides hospital shop and on-site maintenance services.
Process Engineer
The Process Engineer plays a critical role in ensuring efficient, reliable, and compliant engine assembly and disassembly operations. This position focuses on developing and continuously improving maintenance processes by integrating OEM data and engineering insights to optimize performance and reduce downtime. The role works closely with Production and Production Control to streamline workflows, minimize process bottlenecks, and reduce overall turn-around times (TAT), while also creating and harmonizing work plans across MTU locations within the global network to ensure consistency and best practices.
Duties/Responsibilities:
* Lead technical oversight of engine assembly and disassembly processes.
* Maintain harmonized work plans across locations within the global MTU network to ensure standardization and efficiency.
* Work closely with Production and Production Control to reduce process times and overall turn-around times (TAT).
* Develop and optimize innovative maintenance processes, balancing technical feasibility, cost-effectiveness, and stability.
* Provide expert support to production teams during operational disruptions, implementing sustainable corrective actions.
* Continuously improve work plans and integrate updates based on OEM data and engineering insights.
* Drive digitalization and automation initiatives to enhance engineering workflows and production capabilities.
* Participate in internal and external audits, applying MTU quality methodologies to drive process excellence.
Required Skills/Abilities:
* Bachelor's or Master's degree in Aerospace, Manufacturing, or related discipline.
* Proven experience in manufacturing or process engineering within civil aviation.
* Familiarity with OEM manuals (e.g., CMM, AMM, IPC) and digital maintenance platforms.
* Familiarity with automation and digitalization tools; experience with AI/RPA desirable.
* Strong analytical and problem-solving skills with a structured approach.
* Advanced MS Office & SAP skills.
* Excellent organizational and interpersonal skills; experience in international collaboration.
Benefits:
* Medical, Dental, Vision, and STD insurance are effective immediately
* Medical Flexible Spending Accounts
* Employer-paid LTD and Life / AD&D insurance
* 401(k) with employer matching up to 2%, plus an additional discretionary employer contribution of up to 1%
* Paid 2 weeks of Vacation, paid 10 days of PTO & Holidays
* Annual Tuition Reimbursement
* Monthly $30 Gym Membership Reimbursement
* Passport and renewal compliance, and TSA reimbursement
* Employee Assistance Program
Your Future at MTU Starts Here!
Ready to give your career a boost? Send us your complete application, listing your earliest possible start date. We look forward to getting to know you.
MTU Maintenance Dallas, Inc. is an Equal Opportunity Employer. All applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status, sexual orientation, gender identity, or any other protected class.
For more information and additional resources on "EEO is the Law," please visit: ****************************************
Career Accelerator Program - Failure Analysis Engineer - PhD
Reliability engineer job in Dallas, TX
Change the world. Love your job. In your first year with TI, you will participate in the Career Accelerator Program (CAP), which provides professional and technical training and resources to accelerate your ramp into TI, and set you up for long-term career success. Within this program, we also offer function-specific technical training and on-the-job learning opportunities that will encourage you to solve problems through a variety of hands-on, meaningful experiences from your very first day on the job. The TMG Development program is a 12-month program for new college graduates in the TMG organization.
In this role you will use a ToF-SIMS (Time-of-Flight Secondary Ion Mass Spectrometry) tool to characterize semiconductor devices. A primary goal is identifying and collecting chemical information related to the sample surface, including the elemental composition, isotopic ratios, and depth distribution, to support process technology development, wafer fab manufacturing, and package technology development. You will lead and support the surface analysis team by helping to plan, coordinate, and execute experiments to characterize the surface science of semiconductor material using various analytical techniques, primarily XPS, Auger, and FTIR, along with other chemical analysis techniques such as DSIMS, EDX, and XRF. The individual must be driven, detail oriented, and thorough to manage daily operations while interfacing with customers to provide technical reports that steer results. You are expected to develop new techniques and implement corrective actions to address gaps and issues. AI/ML-based data analysis skills are a huge plus. This position also offers a unique opportunity to collaborate with experts in other material characterization areas such as TEM, SEM, FIB, scanning probe microscope analysis, surface analysis, and failure analysis in the lab.
Responsibilities include:
Lead the surface science characterization using analytical techniques, primarily ToF-SIMS (IonTOF and PHI), XPS (PHI), Auger (PHI), and FTIR (ThermoFisher), along with other chemical analysis techniques such as DSIMS, EDX, and XRF.
Hands-on sample preparation, including coordination/logistics and technique development for difficult or complex samples.
Acquire and analyze data collected, generate comprehensive reports including your data interpretation and summary of your observations to help customers in their decision making
Follow all analytical procedures and specs in the lab
Be familiar with instrumentation maintenance and capable of troubleshooting if problems arise. Clearly communicate unresolved issues to the vendor service engineers for fast resolution.
Coordinate with engineers and customer base to address workload quickly with high quality data.
Interact with customers and stakeholders to better understand the background and make recommendations on the technique(s) that provide the best results.
Work in a collaborative, team environment interfacing with customer and lab members.
Texas Instruments will not sponsor job applicants for visas or work authorization for this position
Qualifications
Minimum requirements:
PhD in Material Science, Chemical Engineering, Physics, or Chemistry
Cumulative 3.0/4.0 GPA or higher
Preferred qualifications:
Strong verbal and written communication skills
Demonstrated strong analytical and problem solving skills
Ability to work in teams and collaborate effectively with people in different functions
Ability to take the initiative and drive for results
Strong time management skills that enable on-time project delivery
Ability to work effectively in an interrupt-driven, fast-paced and rapidly changing environment
Demonstrated ability to build strong, influential relationships
Ability to analyze and interpret data, interpret customer input and apply input to TI's products and processes
Ability to take the initiative and ownership for implementing solutions that improve product quality