Reliability Engineer jobs at The Aerospace Corporation - 2166 jobs

  • Supplier Quality Engineer

    Pinnacle Group, Inc. 4.3 company rating

    Columbia, MD jobs

    Job Title: Supplier Quality Engineer
    Pay range: $75,000 USD to $95,000 USD
    Bonus: 7.5%, prorated based on start date

    Must have demonstrated experience working directly with manufacturing suppliers, including leading supplier engagement, communication, and quality initiatives. This role will be responsible for spearheading supplier relationships and driving supplier performance. The Supplier Quality Engineer ensures suppliers meet company standards and regulatory requirements through evaluation, audits, and continuous monitoring. Key duties include supplier qualification, risk management, developing the supplier quality program, and driving continuous improvement initiatives to resolve quality issues and enhance performance.

    What You'll Do

    Supplier Qualification and Onboarding
    * Evaluate and qualify new suppliers based on company standards and requirements.
    * Conduct initial supplier audits and assessments to ensure compliance with quality standards.

    Compliance and Risk Management
    * Monitor supplier performance to ensure compliance with industry standards and regulatory requirements.
    * Identify and mitigate risks in the supply chain through proactive measures and continuous monitoring.

    Supplier Quality Program Development
    * Lead the further development and implementation of the supplier quality program.
    * Establish and maintain quality metrics and KPIs to track supplier performance.

    Continuous Improvement
    * Work with suppliers to develop and implement continuous improvement initiatives.
    * Conduct root cause analysis and implement corrective actions for supplier quality issues.

    Supplier Audits and Assessments
    * Perform regular audits and assessments of suppliers to ensure ongoing compliance and performance.
    * Document audit findings and follow up on corrective actions.

    Collaboration and Communication
    * Collaborate with internal departments such as quality, purchasing, and design engineering to address supplier-related requirements and issues.
    * Communicate effectively with suppliers to ensure alignment on quality expectations and requirements.

    Technical Support and Training
    * Provide technical support and training to suppliers to help them meet quality standards.
    * Assist suppliers in understanding and implementing company-specific quality requirements.

    Travel and Supplier Visits
    * Travel to supplier locations, both domestic and international, as needed to conduct audits, assessments, and support activities.

    Reporting and Documentation
    * Report on supplier performance and quality issues.
    * Maintain accurate and up-to-date documentation of supplier quality activities and performance metrics.

    Additional Expectations
    * Remain compliant with the Code of Conduct and Policies, which include the Kingspan Group Compliance Policy.
    * Ensure that all duties related to product compliance are adhered to in accordance with the Product Compliance Policy, laws, regulations, and market demands.
    * Responsible for all tasks to achieve the compliance goals and demands of the Compliance Management System.
    * Must raise concerns related to the Compliance Management System to their supervisor, manager, or any member of the Leadership Team, or through the confidential whistleblower service.

    What You'll Bring
    * A bachelor's degree is required, preferably in a STEM field.
    * At least 2 to 3 years of relevant experience.
    * Must have quality experience in manufacturing, with a focus on supplier quality.
    * Proficiency with computer applications such as email, Excel, Word, Minitab, Power BI, and Unipoint is essential.
    * Strong critical thinking skills and a solid understanding of statistics are mandatory.
    * Familiarity with QA tools, including root cause analysis techniques like 8D, is required.

    Salary Range: $75K-$95K with bonus

    The specific compensation for this position will be determined by several factors, including the scope, complexity, and location of the role, as well as the cost of labor in the market; the skills, education, training, credentials, and experience of the candidate; and other conditions of employment. Our full-time consultants have access to benefits, including medical, dental, vision, and 401(k) contributions, as well as PTO, sick leave, and other benefits mandated by the applicable state or locality where you reside or work.

    If you receive a suspicious message, email, or phone call claiming to be from PTR Global, do not respond or click on any links. Instead, contact us directly at ***************. To report any concerns, please email us at *******************
    $75k-95k yearly 2d ago
  • Principal Site Reliability Engineer

    Fortinet Inc. 4.8 company rating

    Santa Clara, CA jobs

    At Fortinet, we strive to provide a supportive, collaborative environment where people are empowered to do the best work of their careers. Our team members enjoy solving complex problems and obsess over getting the details right. We love what we do and are proud of our work to secure clouds and container environments for thousands of B2B customers worldwide. Our team is growing, and we are looking for engineers with a passion for automation. You will help support the FortiCNAPP platform and play a key role in building, operating, and improving the FortiCNAPP Cloud Security Platform, the world's best real-time cloud-native threat detection system. Our team develops and supports the infrastructure layers spanning our cloud accounts, network/connectivity, workload management, observability, and storage services. We build tooling to perform automated operations in order to scale the FortiCNAPP infrastructure and service. To be successful, you will design, define, develop, deploy, and operate internal tooling, APIs, and frameworks that streamline our workflows and automate our infrastructure.

    About this role:
    As a Principal Site Reliability Engineer at FortiCNAPP, you will lead the design, implementation, and optimization of our highly scalable, resilient, and efficient platform infrastructure. You will drive strategic initiatives to enhance operational excellence, mentor teams, and set the standard for reliability and automation across the organization. Your expertise will shape the future of FortiCNAPP's infrastructure, ensuring it meets the demands of our customers and supports rapid growth.

    Responsibilities:
    * Architect and implement advanced automation strategies to maximize operational efficiency and minimize toil across the FortiCNAPP platform.
    * Lead the design, development, and enhancement of infrastructure systems to ensure world-class scalability, resiliency, and performance.
    * Proactively identify and resolve complex, systemic issues through innovative automation, tooling, and architectural solutions, preventing customer-facing incidents.
    * Drive the evolution of monitoring, instrumentation, and observability systems to anticipate and mitigate scalability and reliability risks before they impact customers.
    * Champion company-wide adoption of reliability best practices, establishing key metrics, SLAs, and milestones to embed scalability and resiliency into all engineering processes.
    * Collaborate with cross-functional teams to define and implement industry-leading practices for infrastructure, deployment, and operational workflows.
    * Provide technical leadership and mentorship to engineering and operations teams, fostering a culture of reliability, automation, and continuous improvement.
    * Lead incident response and post-mortem processes, driving root cause analysis and implementing preventive measures.
    * Participate in an on-call rotation, serving as an escalation point for complex issues and guiding the team through critical incidents.
    * Influence strategic technology decisions, evaluating and integrating cutting-edge tools, services, and methodologies to enhance platform reliability.

    Minimum Qualifications:
    * 10+ years of DevOps/SRE experience, with at least 5 years in a senior or lead role managing production systems at scale.
    * Expert-level development and automation skills, with a proven track record of building sophisticated tools and workflows.
    * Deep expertise in Infrastructure as Code (e.g., Terraform) and supporting tools (e.g., Atlantis, ArgoCD, Flux).
    * Advanced experience with Kubernetes and its ecosystem (e.g., Helm, operators, Kustomize), including managing large-scale, production-grade clusters.
    * Extensive experience with multiple cloud providers and managed services (e.g., AWS: EKS, EC2, S3, RDS, Secrets Manager; GCP, Azure).
    * Proven ability to architect and operate highly reliable, fault-tolerant cloud infrastructure that supports rapid microservice deployment with robust monitoring and high availability.
    * Exceptional cross-team communication and leadership skills, with experience driving alignment across engineering, product, and operations teams.
    * Deep knowledge of large-scale system building blocks, including load balancing, distributed/cloud computing, container orchestration, and advanced monitoring/observability.
    * Expert understanding of cloud networking, including VPC configuration, cross-cloud connectivity, and hybrid cloud architectures.
    * Proficiency in one or more programming languages (e.g., Python, Go, Rust) for building tools and automation frameworks.

    Preferred Qualifications:
    * Extensive experience designing and implementing advanced monitoring and observability systems (e.g., Prometheus, Grafana, New Relic, Datadog, OpenTelemetry).
    * Strong advocate for "everything as code" principles, with experience institutionalizing IaC and GitOps practices across teams.
    * Deep expertise in Java application servers, JVM tuning, and performance optimization for high-throughput systems.
    * Experience leading cross-functional initiatives to improve system reliability, such as chaos engineering, disaster recovery planning, or zero-downtime deployments.

    Educational Requirements:
    * Bachelor's or Master's degree in Computer Science, Computer Engineering, or related fields.

    The US base salary range for this full-time position is $202,000-$247,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program. Wage ranges are based on various factors including the labor market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location. All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at time of hire and annually at the Company's discretion.

    Why Join Us:
    We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being. Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.
    $202k-247k yearly Auto-Apply 43d ago
  • Staff Reliability Engineer

    Also 4.2 company rating

    Palo Alto, CA jobs

    We're ALSO, an electric mobility company originally conceived as a part of Rivian. We're a passionate team of builders, dreamers, doers and innovators, focused on creating entirely new (not to mention, innovative and delightful) vertically integrated, small EVs designed to meet the global mobility challenges of today and tomorrow. Our mission is to inspire everyone to ride ALSO, replacing many local car, truck and SUV miles with ones on vehicles that are more affordable, more enjoyable and 10-50x more efficient.

    ALSO is looking for a Reliability Engineer to play a key role in developing and leading the reliability of multiple electric mobility vehicles over the full development cycle.

    What You Will Do
    * Establish reliability targets and metrics for new product development that include actuators, batteries, drive units, PCBAs, chassis, and various low voltage electronic systems.
    * Develop new reliability test procedures and specifications, such as highly accelerated life testing, environmental testing, and other reliability tests, to demonstrate reliability.
    * Apply various types of acceleration models for creation of reliability tests to accurately predict product lifetime under accelerated stress conditions.
    * Prepare concise and detailed test plans and analyze test reports.
    * Provide updates and progress on various testing campaigns.
    * Guide the engineering teams on failure modes, typical countermeasures for reliability failures, and design guidelines.
    * Provide risk analysis to guide design decisions.
    * Monitor field performance and identify trends in failure.
    * Lead and facilitate FMEAs with cross-functional teams and develop design verification plans from these activities.

    What You Will Bring
    * Minimum Bachelor of Science in an engineering discipline or equivalent.
    * Five or more years of industry experience in a reliability engineering role.
    * Technical knowledge of one or more aspects related to PCBA reliability, energy storage systems, drive units, and chassis components.
    * Working knowledge of a coding language, preferably Python.
    * Working knowledge of statistical software for reliability, such as JMP or the ReliaSoft suite, or MATLAB.
    * Working knowledge of failure analysis techniques such as SEM, CT, X-Ray, EDS, TDR.
    * Experience with instrumentation for data collection and testing, such as thermocouples, strain gauges, accelerometers.
    * Experience with deploying reliability testing guidelines and inventing new ways of testing.
    * Ability to use FEA software (e.g., Ansys Sherlock) for product life prediction in creation of reliability tests.

    The salary for this position ranges from $220,000 to $255,000 per year, depending on experience and qualifications.

    Why ALSO
    We're passionate about helping the world find a better way to get there, wherever it is you're headed. We're located in the heart of Silicon Valley and have brought together a world-class team from some of the biggest brands in the technology, automotive, cycling, outdoor recreation and retail spaces. Together we're working hands-on to imagine, design and build an entirely new solution to a global set of transportation challenges.

    Perks and Benefits
    * Robust health coverage. Excellent health, dental and vision insurance covered up to 100% by ALSO, with FSA & HSA options.
    * One Medical membership and dedicated insurance advocates.
    * Rich fertility and family building benefits with Progyny.
    * Flexible time off.
    * 401(k) match.
    $220k-255k yearly Auto-Apply 60d+ ago
  • Site Reliability Engineer

    Fortinet Inc. 4.8 company rating

    Sunnyvale, CA jobs

    We are seeking a talented and motivated Site Reliability Engineer to join our engineering team. You will be responsible for building, maintaining, and troubleshooting cloud service/cluster, infrastructure, and monitoring systems to ensure high availability, performance, and security across our services. This is a hands-on role that requires a strong background in infrastructure automation, system reliability, and an SRE mindset of continuous improvement.

    Key Responsibilities:
    * Design, implement, and maintain robust CI/CD pipelines for multiple environments.
    * Manage datacenter infrastructure (Linux servers, network devices, databases, etc.).
    * Improve observability with logging, monitoring, alerting, and tracing tools (e.g., Prometheus, Grafana, ELK, Datadog).
    * Automate operational processes and support system reliability and performance improvements.
    * Collaborate with engineering teams to ensure seamless integration and deployment of new features.
    * Ensure security best practices are followed in all infrastructure and deployment activities.
    * Participate in on-call rotation and incident response handling as needed.

    Required Qualifications:
    * 8+ years of experience in a DevOps, SRE, or Infrastructure Engineering role.
    * Proficiency with CI/CD tools (e.g., Jenkins, GitLab CI, ArgoCD, Rancher, etc.).
    * Strong experience with Kubernetes and container orchestration.
    * Expertise in scripting and automation (Python, Bash, or Go).
    * Familiarity with configuration management tools (Ansible, Chef, etc.).
    * Solid understanding of networking, security, and Linux-based systems.

    Preferred Qualifications:
    * Experience with ingress-controller tech details (nginx-controller, FortiADC, F5 BIG-IP, etc.).
    * Experience with service mesh technologies (Istio, Linkerd).
    * Experience with network devices such as firewall, WAF, switch, etc.
    * Experience with storage service management such as Ceph.
    * Knowledge of zero-downtime deployment strategies and blue-green or canary releases.
    * Experience in high-availability and fault-tolerant system design.
    * Familiarity with GitOps principles and tooling.

    The US base salary range for this full-time position is $170,000-$200,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program. Wage ranges are based on various factors including the labor market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location. All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at time of hire and annually at the Company's discretion.

    Why Join Us:
    We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being. Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.
    $170k-200k yearly Auto-Apply 60d+ ago
  • Site Reliability Engineer - OpenStack

    Fortinet 4.8 company rating

    Sunnyvale, CA jobs

    Fortinet is recruiting a Site Reliability Engineer- OPENSTACK to join our FortiStack team. This team is responsible for the management, operation and continued development of our Openstack-based private cloud platform. This position would represent a great fit for Openstack specialists or IT professionals with a combination of virtualization, Openstack, storage and networking experience. As a Site Reliability Engineer- OpenStack, you will: Play a leading role in the operation, maintenance and development of multiple Openstack private cloud platforms worldwide Focused on continuous improvement - always be looking for improvement, automation and optimization opportunities from the cloud Troubleshoot and resolve technical problems, both individually and as part of the broader team Provide technical leadership, actively sharing knowledge across the Fortistack team and supporting your peers Provide On-call support as required We Are Looking For: Minimum Bachelor's degree in computer science, Computer Technology, or related field. 5+ years of experience in production platform operation Knowledge in Openstack administration Knowledge in server virtualization (KVM, VMware, etc.) Knowledge in Linux server administration (RHEL, CentOS, Ubuntu) Knowledge in network administration and industry standard protocols Knowledge in network, server and system monitoring (Zabbix, Nagios, etc.) Knowledge in at least one scripting language (Python, Bash, etc.) 
Proficiency in automation tools such as Ansible and Terraform for orchestrating configuration management and automated provisioning Experience in troubleshooting Linux-based environments integrated with platforms like OpenStack and Kubernetes Strong analytical skills to interpret stack traces and application logs for effective issue resolution Ability to identify root causes of application issues and provide detailed technical summaries for escalation to relevant teams Excellent ability to organize, multitask, work individually and in a team Excellent verbal and written communication skills Good to Have: Certification in RHCE Certification in Openstack Certification in VMware VCAP, NSX Certification in CCIE Certification in PMP, ITIL Knowledge in container management (Docker, Podman, Kubernetes, etc.) Knowledge in software defined storage and network (Ceph, OVN/openvswitch, NSX-T, etc.) Experience in ISO27001 The US base salary range for this full-time position is $140,000 - $180,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time as well as a comprehensive leave program. Wage ranges are based on various factors including the labor market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location. All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at time of hire and annually at the Company's discretion. About Our Team: Join our team, known for its collaborative ethos, working seamlessly with global customers, internal engineering teams and product development groups. Our team culture emphasizes continuous learning, innovation, and a strong commitment to customer satisfaction. 
We embrace Fortinet's core values of openness, teamwork and innovation, fostering an environment where team members support each other, share knowledge, and leverage AI to solve complex technical challenges. Our inclusive and dynamic team thrives on collaboration and is driven by the shared goal of maintaining Fortinet's high standards of excellence in cybersecurity solutions. Why Join Us: We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being. Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.
    $140k-180k yearly Auto-Apply 60d+ ago
  • Database Reliability Engineer

    Viant Technology 4.3company rating

    Irvine, CA jobs

    WHAT YOU'LL DO We are looking for a skilled and motivated Database Reliability Engineer to join our growing team. In this role, you will support the design, implementation, and day-to-day operations of our database infrastructure across cloud platforms including AWS and Google Cloud Platform (GCP). This position offers a great opportunity to grow technically while contributing to performance, automation, and reliability of our data systems. THE DAY-TO-DAY Database Maintenance and Operations - Maintain database health by managing backups, replication, and routine maintenance tasks across environments (e.g., MySQL, PostgreSQL, SQL Server). Cloud Database Support - Assist with administration of cloud-based databases such as AWS RDS, Aurora, DynamoDB, and Google Cloud SQL, ensuring reliability and performance. Monitoring and Alerting - Set up and maintain monitoring and alerting systems using Prometheus and Grafana, as well as cloud-native tools (e.g., CloudWatch, Stackdriver) to proactively detect and resolve database issues. Performance Tuning - Collaborate with senior DBAs and developers to optimize queries, indexes, and configurations for better performance. Automation and Scripting - Automate recurring tasks using scripts and contribute to deployment pipelines and database change management processes. Security and Access Management - Implement role-based access controls, audit trails, and enforce best practices for data security and compliance. Documentation and Support - Document database configurations, procedures, and incident reports. Provide support during incidents and collaborate with engineers to troubleshoot issues. QUALIFICATIONS AND REQUIREMENTS 2-5 years of experience in database administration in production environments. Experience with relational databases such as MySQL, PostgreSQL, or SQL Server. Hands-on exposure to AWS (e.g., RDS, Aurora) and/or GCP (e.g., Cloud SQL, BigQuery). 
Experience with Linux systems and cloud monitoring tools (e.g., CloudWatch, Stackdriver). Proficient in scripting (e.g., Bash, Python) and automation tools. Familiar with CI/CD and infrastructure automation (e.g., Terraform, GitHub Actions, Jenkins). Hands-on experience with Grafana and Prometheus for database and infrastructure monitoring. Understanding of backup and recovery strategies, replication, and high availability. Basic knowledge of performance tuning and monitoring tools (e.g., EverSQL). Strong analytical and troubleshooting skills; ability to work independently and collaboratively. Exposure to NoSQL or distributed database systems (e.g., MongoDB, Aerospike). Experience with Git, CI/CD pipelines, and infrastructure as code tools (e.g., Terraform, GitHub Actions). Familiarity with containerized environments (Docker, Kubernetes). Experience working in regulated environments or with data compliance standards. LIFE AT VIANT Investing in our employees' professional growth is important to us, but so is investing in their well-being. That's why Viant was voted one of the best places to work and some of our favorite employee benefits include fully paid health insurance, paid parental leave and unlimited PTO and more. Base compensation range: $130,000 - $150,000 In accordance with California law, the range provided is Viant's reasonable estimate of the compensation for this role. Final title and compensation for the position will be based on several factors including work experience and education. Not the right position for you? Check out our other opportunities! Viant Careers #LI-KW1 About Viant Viant Technology Inc. (NASDAQ: DSP) is a leader in CTV and AI-powered programmatic advertising, dedicated to driving innovation in digital marketing. Viant's omnichannel platform built for CTV allows marketers to plan, execute and measure their campaigns with unmatched precision and efficiency. 
With the launch of ViantAI, Viant is building the future of fully autonomous advertising solutions, empowering advertisers to achieve their boldest goals. Viant was recently awarded Best AI-Powered Advertising Solution and Best Demand-Side Platform by MarTech Breakthrough, Great Place to Work certification and received the Business Intelligence Group's AI Excellence Award. Learn more at viantinc.com. Viant is an equal opportunity employer and makes employment decisions on the basis of merit. Viant prohibits unlawful discrimination against employees or applicants based on race (including traits historically associated with race, such as hair texture and protective hairstyles), religion, religious creed, color, national origin, ancestry, physical disability, mental disability, medical condition, genetic information, marital status, sex, reproductive health decision making, gender, gender identity, gender expression, age, military status, veteran status, uniformed service member status, sexual orientation, transgender identity, citizenship status, pregnancy, or any other consideration made unlawful by federal, state, or local laws. Viant also prohibits unlawful discrimination based on the perception that anyone has any of those characteristics, or is associated with a person who has or is perceived as having any of those characteristics. By clicking “Apply for this Job” and providing any information, I accept the Viant California Personnel Privacy Notice.
    $130k-150k yearly Auto-Apply 60d+ ago
  • Site Reliability Engineer

    Fortinet 4.8company rating

    Sunnyvale, CA jobs

    At Fortinet, we strive to provide a supportive, collaborative environment where people are empowered to do the best work of their careers. Our team members enjoy solving complex problems, and obsess over getting the details right. We love what we do and are proud of our work to secure clouds and container environments for thousands of b2b customers worldwide. Our team is growing, and we are looking for engineers with passion for automation. You will help support the Lacework platform and play a key role in building, operating, and improving the Lacework Cloud Security Platform, the world's best real-time cloud-native threat detection system. Our team develops and supports the infrastructure layers spanning our cloud accounts, network/connectivity, workload management, observability, and storage services. We build tooling to perform automated operations in order to scale the Lacework infrastructure and service. To be successful you will design, define, develop, deploy and operate internal tooling, APIs, and frameworks which streamline our workflows and automate our infrastructure. The Role: Automate as much as reasonable to significantly improve operational efficiency of the Lacework platform Design, build and improve our infrastructure to enhance service scalability, resiliency, and efficiency across the company. Identify mission-critical problems and solve them via automation, tooling, communication, and informed design. Build and improve monitoring and instrumentation to predict future scalability or failure risks and solve them before they manifest into customer-facing issues. Facilitate company-wide visibility into key metrics, SLAs, and milestones so that scale and resiliency are a part of every conversation. Develop best practices alongside engineering/operations teams to improve the scalability and reliability of internal processes. Participate in an on-call rotation. 
Minimum Qualifications: 3 years of DevOps/SRE experience with production systems (depending on level) Strong development and automation skills. Extensive experience with Infrastructure as Code (Terraform, etc.), as well as supporting tooling (Atlantis, ArgoCD, etc.) Extensive experience with Kubernetes and supporting tooling (Helm, operators, etc.) Extensive experience with a variety of cloud managed services and providers AWS: EKS, EC2, S3, RDS, Secrets Manager, etc. Experience building production quality cloud infrastructure that enables reliable and rapid deployment of microservices with effective monitoring and built in high availability and/or fault tolerance. Strong passion for using automation to create simple repeatable dev and ops patterns that ensure a stable, reliable experience for customers. Strong cross-team communication skills. Experience with the building blocks of large-scale systems including load balancing, distributed/cloud computing, containers, instrumentation, and monitoring. Knowledge of cloud networking, including VPC configuration and cross-cloud connectivity. Familiarity with one or more programming languages (Python, Golang, etc.). Preferred Qualifications: Experience with monitoring and observability systems and tools (Prometheus, Grafana, New Relic, DataDog, etc.) Believe everything should be "as code" Experience with Java application servers and JVM configuration
    $121k-158k yearly est. Auto-Apply 60d+ ago
  • Site Reliability Engineer

    Psiquantum 4.2company rating

    Palo Alto, CA jobs

    PsiQuantum's mission is to build the first useful quantum computers-machines capable of delivering the breakthroughs the field has long promised. Since our founding in 2016, our singular focus has been to build and deploy million-qubit, fault-tolerant quantum systems. Quantum computers harness the laws of quantum mechanics to solve problems that even the most advanced supercomputers or AI systems will never reach. Their impact will span energy, pharmaceuticals, finance, agriculture, transportation, materials, and other foundational industries. Our architecture and approach is based on silicon photonics. By leveraging the advanced semiconductor manufacturing industry-including partners like GlobalFoundries-we use the same high-volume processes that already produce billions of chips for telecom and consumer electronics. Photonics offers natural advantages for scale: photons don't feel heat, are immune to electromagnetic interference, and integrate with existing cryogenic cooling and standard fiber-optic infrastructure. In 2024, PsiQuantum announced government-funded projects to support the build-out of our first utility-scale quantum computers in Brisbane, Australia, and Chicago, Illinois. These initiatives reflect a growing recognition that quantum computing will be strategically and economically defining-and that now is the time to scale. PsiQuantum also develops the algorithms and software needed to make these systems commercially valuable. Our application, software, and industry teams work directly with leading Fortune 500 companies-including Lockheed Martin, Mercedes-Benz, Boehringer Ingelheim, and Mitsubishi Chemical-to prepare quantum solutions for real-world impact. Quantum computing is not an extension of classical computing. It represents a fundamental shift-and a path to mastering challenges that cannot be solved any other way. The potential is enormous, and we have a clear path to make it real. Come join us. 
Job Summary: Join the OS/Platform team as a Site Reliability Engineer (SRE) and keep our services healthy, observable, and fast. Partnering with the Platform Engineering group, you'll own the day‑to‑day operation of our monitoring stack-Grafana, Prometheus, Loki, and Tempo-crafting dashboards that surface golden signals and drive real‑time insight. You'll codify reliability through SLIs/SLOs, automate runbooks in Python, and lead incident response to maintain world‑class uptime across both on‑prem and AWS environments. Responsibilities: Define, implement, and iterate on Service Level Indicators & Service Level Objectives (SLIs/SLOs) and error budgets for critical services, with a focus on network reliability and data centre interconnects. Build and maintain Grafana dashboards that visualize golden signals (latency, traffic, errors, saturation), extending coverage to network telemetry such as packet loss, jitter, bandwidth utilization, and BGP/EVPN stability. Operate and tune the observability pipeline (Prometheus, Loki, Tempo) to ensure scalable, low-latency telemetry ingestion and alerting for networking as well as compute layers. Drive incident response: triage, mitigate, perform post-incident reviews, and implement preventive actions-particularly for network-related outages, congestion, or misconfigurations. Develop automation and self-service tooling in Python/Bash to streamline alerts, runbooks, and operational tasks, including network monitoring and diagnostics. Collaborate with Platform, Product, and Networking teams on capacity planning, performance testing, traffic engineering, and change management. Improve CI/CD health checks and release safety nets within GitLab, with attention to network dependencies in deployments. 
Contribute to Infrastructure as Code (Terraform, Ansible) for monitoring stack deployments and upgrades, including network observability tooling and configuration Experience/Qualifications: Bachelor's Degree or higher in Computer Science, Engineering, or related technical field. 5+ years in an SRE, DevOps, or Production Engineering role supporting distributed systems in production. Hands-on expertise with observability tools: Grafana, Prometheus, Loki, Tempo (or equivalent). Proven track record designing dashboards and alerts around golden signals and USE/RED methodologies, extended to network utilization, saturation, and error metrics. Solid scripting/automation skills in Python and Bash; familiarity with GitLab CI pipelines. Operational experience with Kubernetes and containerized workloads. Strong working knowledge of AWS services, data centre networking fundamentals, routing protocols, load balancing, and network overlays (e.g., VXLAN/EVPN). Experience running incident response and writing actionable post-mortems, including for network-related events. Familiarity with Infrastructure as Code (Terraform, Ansible) and configuration management. Exposure to regulated environments, multi-region networking architectures, and hybrid on-prem/cloud topologies is a plus. Strong communication and collaboration skills; comfortable acting as a generalist across infrastructure, networking, application, and data layers. PsiQuantum provides equal employment opportunity for all applicants and employees. PsiQuantum does not unlawfully discriminate on the basis of race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), gender identity, gender expression, national origin, ancestry, citizenship, age, physical or mental disability, military or veteran status, marital status, domestic partner status, sexual orientation, genetic information, or any other basis protected by applicable laws. 
Note: PsiQuantum will only reach out to you using an official PsiQuantum email address and will never ask you for bank account information as part of the interview process. Please report any suspicious activity to *************************. We are not accepting unsolicited resumes from employment agencies. The ranges below reflect the target ranges for a new hire base salary. One is for the Bay Area (within 50 miles of HQ, Palo Alto), the second one (if applicable) is for elsewhere in the US (beyond 50 miles of HQ, Palo Alto). If there is only one range, it is for the specific location of where the position will be located. Actual compensation may vary outside of these ranges and is dependent on various factors including but not limited to a candidate's qualifications including relevant education and training, competencies, experience, geographic location, and business needs. Base pay is only one part of the total compensation package. Full time roles are eligible for equity and benefits. Base pay is subject to change and may be modified in the future. U.S. Base Pay Range $120,000-$140,000 USD Bay Area Pay Range $145,000-$165,000 USD
    $145k-165k yearly Auto-Apply 14d ago
  • Site Reliability Engineer

    TP-Link Systems 3.9company rating

    Irvine, CA jobs

    At the forefront of the future of connected living, TP-Link Systems Inc.'s R&D Center in Irvine, Southern California's innovation hub, spearheads research and development of next-generation networking, IoT smart home products, and software services. Our team of passionate engineers is constantly innovating, engineering solutions that transform the end user experience with simpler, smarter, and more reliable connectivity. We're looking for a passionate and experienced Site Reliability Engineer to join our team and play a crucial role in ensuring our cloud platform's security, reliability, scalability, and operational excellence. About Us: Headquartered in the United States, TP-Link Systems Inc. is a global provider of reliable networking devices and smart home products, consistently ranked as the world's top provider of Wi-Fi devices. The company is committed to delivering innovative products that enhance people's lives through faster, more reliable connectivity. With a commitment to excellence, TP-Link serves customers in over 170 countries and continues to grow its global footprint. We believe technology changes the world for the better! At TP-Link Systems Inc., we are committed to crafting dependable, high-performance products to connect users worldwide with the wonders of technology. Embracing professionalism, innovation, excellence, and simplicity, we aim to assist our clients in achieving remarkable global performance and enable consumers to enjoy a seamless, effortless lifestyle. Responsibilities: Assist in implementing and operating Microservices on Kubernetes cloud-based platforms. Collaborate with the Cloud Technical Development and DevOps teams to deploy services to the Multi-Cloud Platform. Conduct Load Tests and Chaos Tests to ensure the scalability and reliability of microservices. Build observability for Microservices and cloud platforms like AWS, OCI, Azure, and GCP. 
Contribute to writing and executing disaster recovery plans in collaboration with the Development and DevOps teams. Help analyze and resolve production risks caused by insufficient resources, such as node groups, CPU, memory, HPA scheduling, JVM pre-warming, etc. Write and maintain scripts for automation using languages like Python, Go, or Bash. Assist in defining and maintaining the KPIs (SLA/SLO/SLI) for all cloud microservices with development teams to better understand the business. Create and maintain technical documentation, including architecture diagrams, design documents, and standard operating procedures. Ensure adherence to security and compliance standards, including ISO27001, SOC2, and GDPR. Participate in incident response efforts to troubleshoot and resolve production issues quickly. Conduct post-incident analysis to identify root causes and potential workarounds/solutions. Contribute to product/technology selection, including implementation of POCs. Be adaptable to change and evolving processes and tools. Participate in mentoring and training less senior members of the team. Be part of the on-call rotation and provide support after work hours and on weekends. Other duties as assigned. Requirements Bachelor's degree in Computer Science, Information Technology, or a related field. 1-3 years of experience as a Site Reliability Engineer or in a related role. Proficiency in programming and scripting languages like Java, Python, Bash, or PowerShell. Hands-on experience in SRE, DevOps, cloud operations, and cloud security best practices. Basic knowledge of security technologies, including Identity and access management, Network security, Application security, and Data protection. Strong problem-solving and analytical skills, with the ability to work independently and as part of a team. Experience in developing and maintaining technical documentation and implementing compliance requirements. 
Additional Skills (Preferred): Relevant cloud certifications include AWS Solutions Architect, Azure Solutions Architect Expert, or GCP Professional Cloud Architect. Experience with container orchestration technologies (e.g., Kubernetes) Benefits Salary range: $100,000 - $140,000 Free snacks and drinks, and provided lunch on Fridays Fully paid medical, dental, and vision insurance (partial coverage for dependents) Contributions to 401k funds Bi-annual reviews, and annual pay increases Health and wellness benefits, including free gym membership Quarterly team-building events At TP-Link Systems Inc., we are continually searching for ambitious individuals who are passionate about their work. We believe that diversity fuels innovation, collaboration, and drives our entrepreneurial spirit. As a global company, we highly value diverse perspectives and are committed to cultivating an environment where all voices are heard, respected, and valued. We are dedicated to providing equal employment opportunities to all employees and applicants, and we prohibit discrimination and harassment of any kind based on race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Beyond compliance, we strive to create a supportive and growth-oriented workplace for everyone. If you share our passion and connection to this mission, we welcome you to apply and join us in building a vibrant and inclusive team at TP-Link Systems Inc. Please, no third-party agency inquiries, and we are unable to offer visa sponsorships at this time.
    $100k-140k yearly Auto-Apply 60d+ ago
  • Site Reliability Engineer

    Intelliswift 4.0company rating

    San Francisco, CA jobs

    Hello, Greetings! We have the following open position. If interested, please send us your resume along with all the details. Desired qualifications • Expertise in designing, analyzing and troubleshooting large-scale distributed systems. • In-depth knowledge of operating systems (processes, threads, concurrency issues, locks, mutexes, semaphores, monitors and how they work). • Familiarity with algorithms, data structures and complexity analysis. • Hands-on Java and Apache optimization, performance tuning and configuration. • Systematic problem-solving approach, coupled with a strong sense of ownership and drive. Additional Information Need Locals
    $104k-140k yearly est. 17h ago
  • Site Reliability Engineer

    Intelliswift 4.0company rating

    San Francisco, CA jobs

    Hello, Greetings! We have the following open position. If interested, please send us your resume along with all the details. Desired qualifications • Expertise in designing, analyzing and troubleshooting large-scale distributed systems. • In-depth knowledge of operating systems (processes, threads, concurrency issues, locks, mutexes, semaphores, monitors and how they work). • Familiarity with algorithms, data structures and complexity analysis. • Hands-on Java and Apache optimization, performance tuning and configuration. • Systematic problem-solving approach, coupled with a strong sense of ownership and drive. Additional Information Need Locals
    $104k-140k yearly est. 60d+ ago
  • Database Reliability Engineer

    Electronic Arts Inc. 4.8company rating

    Austin, TX jobs

    Description & Requirements Electronic Arts creates next-level entertainment experiences that inspire players and fans around the world. Here, everyone is part of the story. Part of a community that connects across the globe. A place where creativity thrives, new perspectives are invited, and ideas matter. A team where everyone makes play happen. EA's Production Infrastructure & Engineering (PI&E) organization provides the essential platforms and infrastructure hosting solutions that power EA's live services. Our charter is to make EA's games and services available to all players anytime and anywhere. To do this, we focus on the high availability of infrastructure, primary services, and studio services. We aim to help developers to experiment and build new games quickly with infrastructure services on-demand and workflows that promote rapid development in the cloud. In all of this, we focus on being there for players where and when they want to play. Through the Cloud Database Infrastructure Lifecycle, Operations, Scalability and performance tuning, the DBA Services team supports EA's critical database services and its games entertaining millions of players around the world. As a Database Reliability Engineer, you will manage production environments and will use automation to design, optimize and implement reliable database solutions. You will report to the Senior Manager of DBA Services. Responsibilities: * You will implement automation and tooling to achieve higher availability, reliability, scalability and consistency across all aspects of Database life cycle. * You will support game launches by ensuring Operational Readiness, stress testing and performance optimization of EA's databases and related infrastructure. * You will collaborate with Infrastructure and Game Teams during incident situations. * You will measure and use data to improve environment reliability. 
* You will collaborate with other engineers to implement guidelines, methodologies, and documentation. * You will ensure that proper documentation is developed and maintained regarding Architecture diagrams, configuration details, Operational procedures to help with knowledge sharing. * You will provide emergency response through on-call rotation and may occasionally work outside usual business hours to support global delivery processes. Qualifications: * Bachelor's or master's degree in Computer Science, Computer Engineering or equivalent work experience. * 3+ years' experience in IT, including coding experience (Python, Go, Ruby). * 2+ years' experience managing high-volume transactional production environments using database technologies such as RDS Aurora, MySQL, Postgres, Cassandra or Mongo. * Experience tuning and optimizing database environments to meet critical uptime requirements. * Experience deploying and optimizing infrastructure through IaC using Terraform, CI/CD, Gitlab or similar technologies. * Experience configuring and managing online backup technologies for database environments, and familiarity with Disaster Recovery processes. * Proficiency with AWS Technologies or other cloud offerings. About Electronic Arts We're proud to have an extensive portfolio of games and experiences, locations around the world, and opportunities across EA. We value adaptability, resilience, creativity, and curiosity. From leadership that brings out your potential, to creating space for learning and experimenting, we empower you to do great work and pursue opportunities for growth. We adopt a holistic approach to our benefits programs, emphasizing physical, emotional, financial, career, and community wellness to support a balanced life. Our packages are tailored to meet local needs and may include healthcare coverage, mental well-being support, retirement savings, paid time off, family leaves, complimentary games, and more. 
We nurture environments where our teams can always bring their best to what they do. Electronic Arts is an equal opportunity employer. All employment decisions are made without regard to race, color, national origin, ancestry, sex, gender, gender identity or expression, sexual orientation, age, genetic information, religion, disability, medical condition, pregnancy, marital status, family status, veteran status, or any other characteristic protected by law. We will also consider for employment qualified applicants with criminal records in accordance with applicable law. EA also makes workplace accommodations for qualified individuals with disabilities as required by applicable law.
    $96k-129k yearly est. 60d+ ago
  • Service Reliability Engineer

    360 It Professionals 3.6company rating

    Santa Clara, CA jobs

    360 IT Professionals is a Software Development Company based in Fremont, California that offers complete technology services in Mobile development, Web development, Cloud computing and IT staffing. Merging Information Technology skills in all its services and operations, the company caters to its globally positioned clients by providing dynamic feasible IT solutions. 360 IT Professionals works along with its clients to deliver high-performance results, based exclusively on each client's one-of-a-kind requirements. Our services are vast and we produce software and web products. We specialize in Mobile development, i.e. iPhone and Android apps. We use Objective C and Swift programming languages to create native applications for iPhone, whereas we use Android Code to develop native applications for Android devices. To create applications that work on cross-platforms, we use a number of frameworks such as Titanium, PhoneGap and JQuery mobile. Furthermore, we build web products and offer services such as web designing, layouts, responsive designing, graphic designing, web application development using frameworks based on model view controller architecture and content management system. Our services also extend to the domain of Cloud Computing, where we provide Salesforce CRM to effectively manage one's business and ease out all the operations by giving an easy platform. Apart from this, we also provide IT Staffing services that can help your organization to a great extent as you can hire highly skilled personnel through us. We make sure that we deliver performance driven products that are optimally developed as per your organization's needs. Take a shot at us for your IT requirements and experience a radical change. Job Description: Primary - Service Reliability Engineer • Front line technical service reliability operators accountable for handling critical customer issues coming in via support phone line and HUB. 
• Responsible for first touch incident resolution (via TSG or SOP) or escalation to the appropriate resource within SLA. • Responsible for monitoring the live service via HUB alerts, Heads Up Displays, manual service checks or customer escalations. • Accountable for High Priority Bridge Moderation (spin up bridge, start whiteboard, document sequence of events). • Document and refine Phone Script, TSGs and SOPs. • Service Request Management (User Provisioning, Client Invites, Environment requests, Deployments, etc.) • Responsible for refining Service Center tools and process Requirements: • Work on-shift as part of a 24x7x365 operations center to provide phone/email support; flexible day or night • Identify the priority and criticality of incoming alerts and prioritize appropriately • Handle high-pressure situations in a calm and professional manner • Understand and edit scripting conventions to resolve issues • Escalate significant issues to service, network, or other operations engineers at all hours of the day/night • Track issues through ticketing systems and follow through until resolution • Write clear and concise operational runbooks such as TSGs • Utilize monitoring tools to proactively identify issues and trends • Aggressively troubleshoot and multitask to maintain service availability and performance Qualifications: • Degree/ MIS, or equivalent experience; • 2-5+ years of relevant phone/tech/ops center experience with Win, Unix, Linux, AWS/Azure • Ticketing system experience • Attention to detail • Solid communication skills (both written and verbal) Additional Information Unfeigned Regards, Preeti Nahar | Sr. Talent & Client Acquisition Specialist - TAG US | 360 IT Professionals Inc. C: +1 510-254-3300 ext. 140
    $109k-149k yearly est. 60d+ ago
  • Site Reliability Engineer

    Thales USA 4.5company rating

    Austin, TX jobs

    Location: Austin, United States of America. Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information and encrypt data to make the connected world more secure. Austin, TX (U.S.), Hybrid Position Summary Thales is seeking a Site Reliability Engineer to ensure a high level of service and operational excellence for the development of an innovative and ambitious Telecommunication solution (high availability, strong performance constraints) deployed in the public cloud. This position is eligible for the NORAM referral program for external candidates. The level is Tier 1 - $2500.00. Key Areas of Responsibility As a Site Reliability Engineer team member, you will contribute to the following topics: Define and implement solution deployment architecture; Define, implement, execute, and improve CI/CD and automation processes; Application support activity (troubleshooting, implementing and deploying fixes); Master the product as a whole, with the aim of proposing improvements concerning the core of the product and the associated deployment and observability processes; Regularly monitor and report on service availability against Service Level Agreements (SLA), Service Level Indicators (SLI) and Service Level Objectives (SLO); Manage support tickets and customer relationships; Interface with other stakeholders/teams to define the solution improvement plan. Minimum Qualifications Bachelor's degree in Engineering, Computer Science or similar technical field, or in lieu of a degree, 4 years of relevant, exempt experience as equivalency.
At least 3 years of experience in monitoring solutions deployed in a cloud infrastructure; At least 2 years' experience of application support activity (troubleshooting, implementing and deploying fixes); At least 1 year of Java development experience; Demonstrated experience with public cloud deployment (GCP or AWS or Azure…), containers and microservices (Docker, Kubernetes, Java); Demonstrated experience with CI/CD and automation (Jenkins, Gitlab, Helm); Prior experience developing or maintaining CI/CD pipelines; Familiarity with SLA/SLI and SLO concepts. If you're excited about working with Thales, but not meeting the requirements for this position, we encourage you to join our Talent Community! Special Position Requirements Schedule: Core Business Hours Monday-Friday 9-5 and on call as needed (approximately one week a month). Travel: infrequently, possibly a few times a year. What We Offer Thales provides an extensive benefits program for all full-time employees working 30 or more hours per week and their eligible dependents, including the following: •Elective Health, Dental, Vision, FSA/HSA, Voluntary Life and AD&D, Whole Group Life w/LTC, Critical Illness, Hospital Indemnity, Accident Insurance, Legal Plan, Identity Theft, and Pet Insurance. •Retirement Savings Plan after 30 days of employment with a company contribution and a match, and with no vesting period. •Company paid holidays and Paid Time Off. •Company provided Life Insurance, Why Join Us? Say HI and learn more about working at Thales. This position will require successfully completing a post-offer background check. Qualified candidates with [a] criminal history will be considered and are not automatically disqualified, consistent with federal law, state law, and local ordinances. We are an equal opportunity employer, including disability and veteran status.
All qualified applicants will receive consideration for employment without regard to sex, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law. If you need an accommodation or assistance in order to apply for a position with Thales, please contact us at ************************************. The reference Total Target Compensation (TTC) market range for this position, inclusive of annual base salary and the variable compensation target, is between Total Target Cash (TTC) 106,194.75 - 173,921.00 USD Annual. This reflects how companies in a similar industry and geographic region generally pay for similar jobs. This range helps the Company make pay decisions as one data point among many. Where a position falls within this range is also dependent on other factors including - but not limited to - the employee's career path history, competencies, skills and performance, as well as the company's annual salary budget, the customer's program requirements, and the company's internal equity. Thales may offer additional benefits and other compensation, depending on circumstances not related to an applicant's status protected by local, state, or federal law. (For internal candidates, if you need more information, please reach out to your HR Shared Service, 1st Point.) Thales provides an extensive benefits program for all full-time employees working 30 or more hours per week and their eligible dependents, including the following: •Elective Health, Dental, Vision, FSA/HSA, Voluntary Life and AD&D, Whole Group Life w/LTC, Critical Illness, Hospital Indemnity, Accident Insurance, Legal Plan, Identity Theft, and Pet Insurance •Retirement Savings Plan after 30 days of employment with a company contribution and a match, and with no vesting period •Company paid holidays and Paid Time Off •Company provided Life Insurance, AD&D, Disability, Employee Assistance Plan, and Well-being Program
    $80k-103k yearly est. Auto-Apply 60d+ ago
  • Reliability Engineer III

    Lancesoft 4.5company rating

    Elkton, VA jobs

    Energy. It defines LanceSoft. Consider our unique ‘keep apace’ operational culture, the spirited lot of hand-picked professionals, our ‘up-to-the-minute’ knowledge base: together they form a dynamic mix of value-generating characteristics that help us delve into the heart of a problem to deliver precise services and solutions - repeatedly. In business since 2000, LanceSoft is a reputed and credible Contingent Workforce Management Services firm that has established itself as a pioneer in providing highly scalable workforce solutions and exceptionally competent global IT services to a diverse set of customers across various industries around the globe. LanceSoft is headquartered in the Washington DC metropolitan area (Herndon, VA) and operates out of various locations in the US, Canada and India. Job Description Ensures equipment is maintained in compliance with all safety, environmental, and quality requirements; defines, designs, develops, monitors, and refines the preventive/predictive maintenance processes and precision maintenance techniques; performs root cause failure analysis on equipment failures and develops strategies to prevent recurrence; supports the Planner/Scheduler to develop technically complex work procedures; serves as technical liaison for capital projects to ensure reliability, maintainability and operability of installed equipment; monitors the performance of maintenance metrics, investigates variance from targets and implements action items to improve performance; supports all aspects of process change control (evaluation, approval, communication, planning, scheduling, validation and project documentation, auditing, training, implementation). Qualifications As a member of the Engineering/Maintenance/Utility Department, the primary role of the Reliability Engineer is to provide and/or improve asset reliability and capacity of manufacturing process and utility equipment in a Pharmaceutical/Vaccine manufacturing facility.
Through analysis of equipment history, process data, and Failure Modes and Effects Analysis, the Reliability Engineer is responsible for the correction of equipment problems causing repetitive failures. Additional Information All your information will be kept confidential according to EEO guidelines.
    $81k-105k yearly est. 17h ago
  • Reliability Engineer III

    Lancesoft 4.5company rating

    Elkton, VA jobs

    Energy. It defines LanceSoft. Consider our unique ‘keep apace’ operational culture, the spirited lot of hand-picked professionals, our ‘up-to-the-minute’ knowledge base: together they form a dynamic mix of value-generating characteristics that help us delve into the heart of a problem to deliver precise services and solutions - repeatedly. In business since 2000, LanceSoft is a reputed and credible Contingent Workforce Management Services firm that has established itself as a pioneer in providing highly scalable workforce solutions and exceptionally competent global IT services to a diverse set of customers across various industries around the globe. LanceSoft is headquartered in the Washington DC metropolitan area (Herndon, VA) and operates out of various locations in the US, Canada and India. Job Description Ensures equipment is maintained in compliance with all safety, environmental, and quality requirements; defines, designs, develops, monitors, and refines the preventive/predictive maintenance processes and precision maintenance techniques; performs root cause failure analysis on equipment failures and develops strategies to prevent recurrence; supports the Planner/Scheduler to develop technically complex work procedures; serves as technical liaison for capital projects to ensure reliability, maintainability and operability of installed equipment; monitors the performance of maintenance metrics, investigates variance from targets and implements action items to improve performance; supports all aspects of process change control (evaluation, approval, communication, planning, scheduling, validation and project documentation, auditing, training, implementation). Qualifications As a member of the Engineering/Maintenance/Utility Department, the primary role of the Reliability Engineer is to provide and/or improve asset reliability and capacity of manufacturing process and utility equipment in a Pharmaceutical/Vaccine manufacturing facility.
Through analysis of equipment history, process data, and Failure Modes and Effects Analysis, the Reliability Engineer is responsible for the correction of equipment problems causing repetitive failures. Additional Information All your information will be kept confidential according to EEO guidelines.
    $81k-105k yearly est. 60d+ ago
  • Process Quality Engineer - Swing shift

    Hyve Solutions 3.9company rating

    Fremont, CA jobs

    At HYVE Solutions, our mission is to help customers, business partners, and employees achieve success through shared goals, strategies, resources and technology solutions. Salary range: $90K-120K THE SYNNEX CULTURE SYNNEX creates additional value for all of our partners at all transaction points. For the company to succeed, each SYNNEX associate is focused on delivering the finest products, services, and solutions in the industry. SYNNEX values and rewards loyalty, teamwork, integrity, and industry. We encourage team collaboration and the spirit of entrepreneurship. Our associates are our greatest asset, and we are dedicated to providing our team members with the opportunity to realize personal growth and professional success. Get in S•Y•N•C• with SYNNEX: Start Your New Career as a Quality Engineer THE RIGHT FIT SYNNEX Corporation is looking for a detail-oriented, hands-on, results-driven individual with proven communication skills and a strong work ethic to work in a challenging, fast-paced, energetic environment with responsibilities that include managing all aspects of the quality control production process, fall-out, audits and ISO; ensuring that division and departmental practices comply with company requirements; achieving stated objectives; and meeting current ISO standards. PRINCIPAL DUTIES AND RESPONSIBILITIES (ESSENTIAL FUNCTIONS) Main point of contact for process quality issues, including any inspection activities, priorities and escalations. Collaborate with production teams to address quality issues and implement corrective actions. Collaborate with PE/TE/PM to ensure alignment on quality objectives and priorities. Support regular inspections and audits of manufacturing processes and products to identify defects or deviations from quality standards. Provide guidance on the acceptance criteria for cosmetic issues to the QC and MFG teams. Coordinate and resolve Stop line, Quarantine and QRQC (Quick Response Quality Control) issues.
Refocus QA resources from data-gathering/reporting to using audits to drive process improvement opportunities. Direct QA resources from performing primarily in-line audits to auditing primary upstream processes. Establish and build closer links between site QA teams and Engineering / Manufacturing teams. Work with internal Production, Engineering, Shipping/Receiving, Warehouse, Program Managers and Procurement to meet quality standards. Develop proactive solutions and implement Quality department strategies across the organization. Serve as the customer-facing, site-based QA representative who can effectively present and communicate to internal customers and other areas. Direct site QA teams to maintain consistent standards & metrics & to share/implement best practices across products. Review and approve Product and Process Corrective and Preventive Action Plans (8Ds) and perform additional assessment and analysis as assigned. Perform failure analyses, root cause analysis and corrective action follow-up. Assess and evaluate all reliability testing, equipment service and calibration and the verification process. Execute internal audits on QMS, EMS, ISO 9001 and ISO 14001 standards. Coordinate UL (Underwriters Laboratories) and other regulatory factory audits. ESSENTIAL CRITERIA BS degree in Computer Science, Electrical, Mechanical or Industrial Engineering or relevant discipline plus 3 years of experience including a combination of 2+ years in contract manufacturing, 3+ years in quality control and 3+ years in a leadership position or equivalent combination of experience. Prior manufacturing engineering and quality experience. Proven understanding of mechanical drawing and/or tools. Experience with server/computer build or repair processes. Knowledge of key customer processes (e.g. Microsoft).
Demonstrated background in interfacing with key customers within the high tech industry and experience working across multiple sites sharing best practices & implementing process improvements. Working knowledge of MS Office programs: Word, Excel and PowerPoint. Hands-on experience with quality system training including understanding of SPC (Statistical Process Control) principles and tools. Established ability to prioritize and manage multiple projects to meet strict deadlines. Flexibility to work in a fast-paced, high volume, and diverse environment across functions to produce expected results. Able to work as business needs require, which may include long days, occasional evenings and weekends, and travel to all manufacturing and warehouse locations for business meetings or training. WHAT SYNNEX OFFERS YOU SYNNEX Corporation (SNX) is committed to investing in our associates. They are our greatest asset, and we are dedicated to providing our team members with the opportunity to realize professional and personal growth. If you share our mission, our strong work ethic, and our values of integrity, continuous learning, quality of work, commitment, teamwork, execution and results, respect for the individual, and taking manageable risks, then SYNNEX may be the place for you.
Competitive Compensation, Profit Sharing, Employee Stock Purchase Plan, Paid Vacation Days, Paid Holidays, Paid Sick Days, Direct Deposit, Tuition Reimbursement, Medical and Prescription Insurance, Dental Insurance, Vision Care, Life & Accident Insurance, Development Scholarship Program, Flexible Spending Accounts (FSA), Short- & Long-Term Disability, Bereavement and Jury Duty Leaves, Casual Dress Code, Employee Assistance Program, Live Well Work Well Program, Training Opportunities, Pet Insurance. “SYNNEX Corporation is an Equal Employment Opportunity employer M/F/D/V and is committed to the Quality Policy.” Note: The preceding job description has been designed to indicate the general nature and level of work performed by employees with this classification. It is not designed to contain or be interpreted as a comprehensive inventory of all duties, responsibilities, and qualifications required of employees assigned to this job. At HYVE Solutions, we believe employees are our greatest asset and we empower them to make a difference in our business. Diversity and inclusion make us all better. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.
    $90k-120k yearly Auto-Apply 60d+ ago
