Security Operations Engineer
Service engineer job at Microsoft
In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day and we need you as a **Security Operations Engineer**. Microsoft's Cloud Operations & Innovation (CO+I) is the engine that powers our cloud services. As a Network Security Service Engineer you will perform a key role in delivering the core infrastructure and foundational technologies for Microsoft's online services including Bing, Office 365, Xbox, OneDrive, and the Microsoft Azure platform. You will implement and operate modern cloud and on-premises cybersecurity controls to defend Microsoft datacenter critical infrastructure from threat actors. Leveraging multiple solutions and partnering with internal and external teams, you will be at the forefront of advancing industrial network cybersecurity capabilities.
Through on the job learning and bi-directional mentorship, this opportunity will allow you to gain cyber defense, automation, and networking skills and experiences that are rare in both networking and security organizations, and in high demand across multiple industries. This is a flexible work opportunity for you to work from home partially or fully if desired. As a group, CO+I is focused on personal and professional development for all employees and offers trainings and growth opportunities including Career Rotation Programs, Diversity & Inclusion trainings and events, and professional certifications.
Our infrastructure is comprised of a large global portfolio of more than 100 datacenters and 1 million servers. Our foundation is built upon and managed by a team of subject matter experts working to support services for more than 1 billion customers and 20 million businesses in over 90 countries worldwide.
With environmental sustainability and optimization at the forefront of our datacenter design and operations, we continue to grow and evolve as we meet the ever-changing business demands that hold Microsoft as a world-class cloud provider.
Do you want to empower billions across the world? Come and join us in CO+I and be at the forefront of the action!
**Responsibilities**
+ Proactively identify and investigate potential issues and patterns in security controls and recommend mitigation strategies, while also surfacing opportunities for automation to improve efficiency and effectiveness across the network.
+ Install, upgrade, and maintain security hardware, operating systems, and software.
+ Identify gaps in security policy and administration, recommend solutions, and implement new and revised security standards, while working with partner teams to drive consistency and awareness.
+ Maintain standards and drive improvements for our customer and partner experience, responding appropriately to emerging issues and advocating for our customer experience through analyzing key metrics, performance indicators, and other data sources (e.g., bugs, unhealthy data pipelines). Escalate issues and recommend improvements as appropriate to address gaps.
+ Participate in on-call rotation to support security services.
+ With minimal guidance, analyze attempted or successful efforts to compromise systems security and, alongside partner teams, create recommendations to limit exposure, implement response plans, and take action.
+ Analyze potential or actual intrusions identified from monitoring activities and create detections based on available data (e.g., Indicators of Compromise [IOC] and Tools, Tactics, and Procedures [TTP]).
+ Administer globally distributed Authentication, Authorization, and Accounting (AAA) and Privileged Access Management (PAM) functions end-to-end.
**Other**
+ Embody our culture and values.
**Qualifications**
**Required Qualifications:**
+ 1+ years of experience in software development lifecycle, large-scale computing, modeling, cyber security, anomaly detection, Security Operations Center (SOC) detection, threat analytics, security incident and event management (SIEM), information technology (IT), and operations incident response
+ Bachelor's Degree in Statistics, Mathematics, Computer Science, or related field.
+ Experience with security tooling such as Firewalls, Intrusion detection/prevention systems, or Identity and Access Management (IAM)
While not required, we also look for the following **Preferred Qualifications:**
+ 3+ years of experience in software development lifecycle, large-scale computing, modeling, cyber security, anomaly detection.
+ Master's Degree in Statistics, Mathematics, Computer Science, or related field.
+ CISSP, CISA, CISM, SANS, GCIA, GCIH, OSCP, PCCSE, PCNSE, PCSAE, CCNP Security, CCIE Security and/or Security+ certification.
+ 1+ years of direct experience designing, deploying, or operating common Identity and Access Management (IAM) tooling
+ Any experience with industrial control systems is preferred (not mandatory)
**Background Check Requirements:**
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
+ Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Security Operations Engineering IC2 - The typical base pay range for this role across the U.S. is USD $84,200 - $165,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $109,000 - $180,400 per year.
Certain roles may be eligible for benefits and other compensation.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
Entry Level Hire Remote Technical Support - AI and Data Science Engineer and Support 2026
Poughkeepsie, NY jobs
**Introduction** IBM Infrastructure is a catalyst that makes the world work better because our clients demand it. Heterogeneous environments, the explosion of data, digital automation, and cybersecurity threats require hybrid cloud infrastructure that only IBM can provide.
Your ability to be creative, to think ahead, and to focus on innovation that matters is supported by our growth-minded culture as we continue to drive career development across our teams. Collaboration is key to IBM Infrastructure's success, as we bring together different business units and teams that balance their priorities in a way that best serves our clients' needs.
IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
**Your role and responsibilities**
We are seeking a motivated and technically skilled early-career professional to join our AI and Data Science development team. As a Junior Developer, you will contribute to the design, development, and implementation of AI solutions and look for ways to efficiently collect, clean, analyze, and visualize data to inform business decisions that support real-world applications across enterprise systems. This role is ideal for someone with a strong foundation in machine learning and software engineering who is eager to grow in a collaborative, innovation-driven environment. You will work with Senior Developers to build predictive models, generate insights, and help optimize company performance.
**Required technical and professional expertise**
- Proficiency in Python and experience with libraries such as NumPy, pandas, scikit-learn.
- Solid understanding of machine learning algorithms and model evaluation techniques.
- Experience with Git and collaborative development workflows.
- Ability to work with structured and unstructured data, including preprocessing and transformation.
- Familiarity with software engineering principles and debugging practices.
- Strong analytical and problem-solving skills.
**Preferred technical and professional experience**
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow, Keras).
- Exposure to model deployment using Docker, REST APIs, or cloud platforms (AWS, Azure, GCP).
- Understanding of MLOps tools and practices (e.g., MLflow, Kubeflow, CI/CD pipelines).
- Knowledge of distributed systems, storage architectures (e.g., IBM Storage Scale), and performance optimization.
- Familiarity with Linux environments and container orchestration (e.g., Kubernetes, OpenShift).
- Awareness of ethical AI principles, including fairness, transparency, and bias mitigation.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Entry Level Hire Remote Technical Support - AI and Data Science Engineer and Support 2026
Austin, TX jobs
**Introduction** IBM Infrastructure is a catalyst that makes the world work better because our clients demand it. Heterogeneous environments, the explosion of data, digital automation, and cybersecurity threats require hybrid cloud infrastructure that only IBM can provide.
Your ability to be creative, to think ahead, and to focus on innovation that matters is supported by our growth-minded culture as we continue to drive career development across our teams. Collaboration is key to IBM Infrastructure's success, as we bring together different business units and teams that balance their priorities in a way that best serves our clients' needs.
IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
**Your role and responsibilities**
We are seeking a motivated and technically skilled early-career professional to join our AI and Data Science development team. As a Junior Developer, you will contribute to the design, development, and implementation of AI solutions and look for ways to efficiently collect, clean, analyze, and visualize data to inform business decisions that support real-world applications across enterprise systems. This role is ideal for someone with a strong foundation in machine learning and software engineering who is eager to grow in a collaborative, innovation-driven environment. You will work with Senior Developers to build predictive models, generate insights, and help optimize company performance.
**Required technical and professional expertise**
- Proficiency in Python and experience with libraries such as NumPy, pandas, scikit-learn.
- Solid understanding of machine learning algorithms and model evaluation techniques.
- Experience with Git and collaborative development workflows.
- Ability to work with structured and unstructured data, including preprocessing and transformation.
- Familiarity with software engineering principles and debugging practices.
- Strong analytical and problem-solving skills.
**Preferred technical and professional experience**
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow, Keras).
- Exposure to model deployment using Docker, REST APIs, or cloud platforms (AWS, Azure, GCP).
- Understanding of MLOps tools and practices (e.g., MLflow, Kubeflow, CI/CD pipelines).
- Knowledge of distributed systems, storage architectures (e.g., IBM Storage Scale), and performance optimization.
- Familiarity with Linux environments and container orchestration (e.g., Kubernetes, OpenShift).
- Awareness of ethical AI principles, including fairness, transparency, and bias mitigation.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Entry Level Hire Remote Technical Support - AI and Data Science Engineer and Support 2026
Tucson, AZ jobs
**Introduction** IBM Infrastructure is a catalyst that makes the world work better because our clients demand it. Heterogeneous environments, the explosion of data, digital automation, and cybersecurity threats require hybrid cloud infrastructure that only IBM can provide.
Your ability to be creative, to think ahead, and to focus on innovation that matters is supported by our growth-minded culture as we continue to drive career development across our teams. Collaboration is key to IBM Infrastructure's success, as we bring together different business units and teams that balance their priorities in a way that best serves our clients' needs.
IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
**Your role and responsibilities**
We are seeking a motivated and technically skilled early-career professional to join our AI and Data Science development team. As a Junior Developer, you will contribute to the design, development, and implementation of AI solutions and look for ways to efficiently collect, clean, analyze, and visualize data to inform business decisions that support real-world applications across enterprise systems. This role is ideal for someone with a strong foundation in machine learning and software engineering who is eager to grow in a collaborative, innovation-driven environment. You will work with Senior Developers to build predictive models, generate insights, and help optimize company performance.
**Required technical and professional expertise**
- Proficiency in Python and experience with libraries such as NumPy, pandas, scikit-learn.
- Solid understanding of machine learning algorithms and model evaluation techniques.
- Experience with Git and collaborative development workflows.
- Ability to work with structured and unstructured data, including preprocessing and transformation.
- Familiarity with software engineering principles and debugging practices.
- Strong analytical and problem-solving skills.
**Preferred technical and professional experience**
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow, Keras).
- Exposure to model deployment using Docker, REST APIs, or cloud platforms (AWS, Azure, GCP).
- Understanding of MLOps tools and practices (e.g., MLflow, Kubeflow, CI/CD pipelines).
- Knowledge of distributed systems, storage architectures (e.g., IBM Storage Scale), and performance optimization.
- Familiarity with Linux environments and container orchestration (e.g., Kubernetes, OpenShift).
- Awareness of ethical AI principles, including fairness, transparency, and bias mitigation.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Critical Facility Engineer
New Albany, OH jobs
Meta is seeking a data center Critical Facility Engineer to join our Data Center Facility Operations team. Our data centers serve as the foundation upon which our software operates to meet the demands of our customers. The Critical Facility Engineer will be a part of the Facility Operations team responsible for operating and maintaining critical systems in our data centers.
Minimum Qualifications
* 6+ years of experience in electrical, HVAC, mechanical, controls, or other technical maintenance field
* Associate's degree in engineering plus 4+ years experience or Bachelor's degree in related field plus 2+ years experience in electrical, HVAC, mechanical, or controls will be considered in lieu of 6+ years experience
* Use hands and fingers
* Reach/push/pull with arms/hands/shoulders
* Stoop, kneel, crouch and crawl
* Lift and/or otherwise move 45 pounds or more
* Able to sit or stand at a workstation for extended periods of time
Preferred Qualifications
* Knowledge of mechanical, electrical, and life safety monitoring and control systems typically used in critical environments
* Knowledge of EAM/CMMS systems
* Experience interpreting blueprints/CAD drawings
* Trade Certification or state licensure in Electrical or Mechanical (HVAC)
* 8+ years experience in a data center or other Critical Environment (pharma, clean room, medical, power production, etc.)
* Professional affiliations (7x24 Exchange, IFMA, Data Center Pulse, etc.)
* Knowledge of critical facilities operations
Responsibilities
* Perform hands-on operations and maintenance which includes all physical and administrative operations tasks, service, and maintenance in accordance with site processes and procedures to ensure the highest levels of uptime, efficiency, and safety without disruption to the business
* Complete work order requests accurately and on time in an enterprise asset management (EAM) system (Hexagon)
* Achieve and maintain a high level of technical knowledge regarding data center infrastructure and operations. Successfully complete personnel qualification standards (PQS) training
* Provide technical expertise and assistance as required
* Respond quickly to data center facility emergencies, using Meta procedures or emergency operating procedures
* Regularly inspect equipment, buildings, safety routes and grounds to check or identify any abnormal or unsafe conditions or faults
* Troubleshoot, evaluate and recommend system upgrades
* Order parts and supplies for maintenance and repairs
* Work with vendors and contractors to ensure work is in accordance with the agreed Meta processes, procedures, and standards
* Escalate issues to facility management appropriately and in a timely manner
* Assist in scheduling and supervising vendors/subcontractors during equipment/systems maintenance and service
* Recommend improvements to the operations and maintenance program on an ongoing basis
* Required to work on a shift schedule, which may include nights and holidays
* Utilize computerized tooling, in a control room environment, to operate remote equipment, monitor system alerts, centralize communications, diagnose faults, and prioritize dispatching of resources
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.
Global Infrastructure Engineer
New Albany, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest impact initiatives, and to work with others to bring them to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next-gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure including physical vs logical layer trade-offs.
**Required Skills:**
Global Infrastructure Engineer Responsibilities:
1. Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
2. Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
3. Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
4. Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
5. Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
6. Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance physical and logical layer
7. Act as a global communication and advisory point of contact for the design, implementation and delivery of projects that affect our global data center and server fleet and facilitate resolution of issues drawing on local expertise and global support partners
8. Leverage data-driven methodologies to understand a problem at the onset, define a plan, and measure progress throughout a project
9. Provide data-supported narratives and ensure a focus on continuous improvement
10. Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
11. Approximately 20% - 30% travel
**Minimum Qualifications:**
1. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
2. Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
3. Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
4. 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
5. Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
6. Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
7. Experience building, operating, and scaling with Linux or Unix Operating systems
8. Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
9. Experience with Data Center Design and Expansion
**Preferred Qualifications:**
1. Extensive knowledge of storage and AI/ML related services and the hardware that supports them
2. Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
3. Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
4. Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
**Public Compensation:**
$191,000/year to $262,000/year + bonus + equity + benefits
**Industry:** Internet
**Equal Opportunity:**
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.
Critical Facility Engineer
New Albany, OH jobs
Meta is seeking a data center Critical Facility Engineer to join our Data Center Facility Operations team. Our data centers serve as the foundation upon which our software operates to meet the demands of our customers. The Critical Facility Engineer will be a part of the Facility Operations team responsible for operating and maintaining critical systems in our data centers.
**Required Skills:**
Critical Facility Engineer Responsibilities:
1. Perform hands-on operations and maintenance which includes all physical and administrative operations tasks, service, and maintenance in accordance with site processes and procedures to ensure the highest levels of uptime, efficiency, and safety without disruption to the business
2. Complete work order requests accurately and on time in an enterprise asset management (EAM) system (Hexagon)
3. Achieve and maintain a high level of technical knowledge regarding data center infrastructure and operations. Successfully complete personnel qualification standards (PQS) training
4. Provide technical expertise and assistance as required
5. Respond quickly to data center facility emergencies, using Meta procedures or emergency operating procedures
6. Regularly inspect equipment, buildings, safety routes and grounds to check or identify any abnormal or unsafe conditions or faults
7. Troubleshoot, evaluate and recommend system upgrades
8. Order parts and supplies for maintenance and repairs
9. Work with vendors and contractors to ensure work is in accordance with the agreed Meta processes, procedures, and standards
10. Escalate issues to facility management appropriately and in a timely manner
11. Assist in scheduling and supervising vendors/subcontractors during equipment/systems maintenance and service
12. Recommend improvements to the operations and maintenance program on an ongoing basis
13. Required to work on a shift schedule, which may include nights and holidays
14. Utilize computerized tooling, in a control room environment, to operate remote equipment, monitor system alerts, centralize communications, diagnose faults, and prioritize dispatching of resources
**Minimum Qualifications:**
1. 6+ years of experience in electrical, HVAC, mechanical, controls, or other technical maintenance field
2. Associate's degree in engineering plus 4+ years experience or Bachelor's degree in related field plus 2+ years experience in electrical, HVAC, mechanical, or controls will be considered in lieu of 6+ years experience
3. Use hands and fingers
4. Reach/push/pull with arms/hands/shoulders
5. Stoop, kneel, crouch and crawl
6. Lift and/or otherwise move 45 pounds or more
7. Able to sit or stand at a workstation for extended periods of time
**Preferred Qualifications:**
1. Knowledge of mechanical, electrical, and life safety monitoring and control systems typically used in critical environments
2. Knowledge of EAM/CMMS systems
3. Experience interpreting blueprints/CAD drawings
4. Trade Certification or state licensure in Electrical or Mechanical (HVAC)
5. 8+ years experience in a data center or other Critical Environment (pharma, clean room, medical, power production, etc.)
6. Professional affiliations (7x24 Exchange, IFMA, Data Center Pulse, etc.)
7. Knowledge of critical facilities operations
**Public Compensation:**
$42.31/hour to $58.65/hour + bonus + equity + benefits
**Industry:** Internet
**Equal Opportunity:**
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.
Global Systems Engineer
New Albany, OH jobs
Meta is seeking a forward-thinking, experienced Data Center Systems Engineer to join the Data Center Site Operations team. Our data centers, and the tens of thousands of servers installed in them, are the foundation upon which our rapidly scaling infrastructure efficiently operates and upon which our innovative services are delivered. Meta is at the leading edge of the global data center industry in terms of both how data centers are designed and how they are operated. This person should enjoy working in a fast-paced environment where adaptability and flexibility will be key to their success. We seek a forward-thinking IT professional with experience using a variety of software tools to identify automation solutions that address complex operational issues. This role is cross-functional and considers the technical needs of frontline users to identify and automate diagnostic tooling that enables quality and efficient delivery of production servers. Candidates should have demonstrated experience performing data analysis to drive decisions on the top priorities for automating server repairs in a hyperscale environment. The ideal candidate is an engineer who can drive solutions with code and through collaboration and clear, timely communication with global teams.
**Required Skills:**
Global Systems Engineer Responsibilities:
1. Deliver maximum server fleet uptime and utilization rates by leveraging data to understand hardware failure conditions and root causes. Identify trends and systemic issues in the fleet and drive resolution
2. Collaborate with stakeholders and subject matter experts to interpret business and operational needs and articulate success criteria in partnership with engineering and field-based operations teams
3. Build cross functional relationships and have the capacity to influence policies and procedures to improve global data center operations
4. Mentor team members to evaluate and identify better ways to resolve issues and define updates to tools and processes
5. Write and review code, develop documentation, and debug the hardest problems, live, on some of the largest and most complex systems in the world
6. Participate in defining diagnostic tooling requirements with multiple cross-functional support teams
7. Execute validation and verification activities for new product integration, including system-level testing
8. Through consistent collaboration with cross-functional tooling teams, help determine root cause and provide input into their development process, with an operations-centric view of how open issues are affecting the fleet
9. Capacity to travel up to 25% required
**Minimum Qualifications:**
10. Engineering degree or commensurate experience
11. 7+ years of experience in systems infrastructure operations or related field
12. Experience in configuration and maintenance of applications such as web servers, load balancers, relational databases, storage systems and messaging systems
13. Experience coding in higher-level languages (e.g., Python, PHP, C++, or Java)
14. Experience learning software, frameworks and APIs
15. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
16. Experienced with Linux systems
**Public Compensation:**
$132,000/year to $191,000/year + bonus + equity + benefits
**Industry:** Internet
Global Infrastructure Engineer
New Albany, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest-impact initiatives and to work with others to drive them to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next-gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure, including physical vs. logical layer trade-offs.
Minimum Qualifications
* Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
* Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
* Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
* 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
* Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
* Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
* Experience building, operating, and scaling with Linux or Unix Operating systems
* Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
* Experience with Data Center Design and Expansion
Preferred Qualifications
* Extensive knowledge of storage and AI/ML related services and the hardware that supports them
* Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
* Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
* Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
Responsibilities
* Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
* Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
* Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
* Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
* Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
* Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance the physical and logical layers
* Act as a global communication and advisory point of contact for the design, implementation, and delivery of projects that affect our global data center and server fleet, and facilitate resolution of issues by drawing on local expertise and global support partners
* Leverage data-driven methodologies to understand a problem at the outset, define a plan, and measure progress throughout a project
* Provide data-supported narratives and ensure a focus on continuous improvement
* Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
* Approximately 20% - 30% travel
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.
Global Systems Engineer
New Albany, OH jobs
Meta is seeking a forward-thinking, experienced Data Center Systems Engineer to join the Data Center Site Operations team. Our data centers, and the tens of thousands of servers installed in them, are the foundation upon which our rapidly scaling infrastructure efficiently operates and upon which our innovative services are delivered. Meta is at the leading edge of the global data center industry in terms of both how data centers are designed and how they are operated. This person should enjoy working in a fast-paced environment where adaptability and flexibility will be key to their success. We seek a forward-thinking IT professional with experience using a variety of software tools to identify automation solutions that address complex operational issues. This role is cross-functional and considers the technical needs of frontline users to identify and automate diagnostic tooling that enables quality and efficient delivery of production servers. Candidates should have demonstrated experience performing data analysis to drive decisions on the top priorities for automating server repairs in a hyperscale environment. The ideal candidate is an engineer who can drive solutions with code and through collaboration and clear, timely communication with global teams.
Minimum Qualifications
* Engineering degree or commensurate experience
* 7+ years of experience in systems infrastructure operations or related field
* Experience in configuration and maintenance of applications such as web servers, load balancers, relational databases, storage systems and messaging systems
* Experience coding in higher-level languages (e.g., Python, PHP, C++, or Java)
* Experience learning software, frameworks and APIs
* Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
* Experienced with Linux systems
Responsibilities
* Deliver maximum server fleet uptime and utilization rates by leveraging data to understand hardware failure conditions and root causes. Identify trends and systemic issues in the fleet and drive resolution
* Collaborate with stakeholders and subject matter experts to interpret business and operational needs and articulate success criteria in partnership with engineering and field-based operations teams
* Build cross functional relationships and have the capacity to influence policies and procedures to improve global data center operations
* Mentor team members to evaluate and identify better ways to resolve issues and define updates to tools and processes
* Write and review code, develop documentation, and debug the hardest problems, live, on some of the largest and most complex systems in the world
* Participate in defining diagnostic tooling requirements with multiple cross-functional support teams
* Execute validation and verification activities for new product integration, including system-level testing
* Through consistent collaboration with cross-functional tooling teams, help determine root cause and provide input into their development process, with an operations-centric view of how open issues are affecting the fleet
* Capacity to travel up to 25% required
Global Infrastructure Engineer
Newark, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest-impact initiatives and to work with others to drive them to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next-gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure, including physical vs. logical layer trade-offs.
**Required Skills:**
Global Infrastructure Engineer Responsibilities:
1. Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
2. Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
3. Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
4. Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
5. Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
6. Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance the physical and logical layers
7. Act as a global communication and advisory point of contact for the design, implementation, and delivery of projects that affect our global data center and server fleet, and facilitate resolution of issues by drawing on local expertise and global support partners
8. Leverage data-driven methodologies to understand a problem at the outset, define a plan, and measure progress throughout a project
9. Provide data-supported narratives and ensure a focus on continuous improvement
10. Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
11. Approximately 20% - 30% travel
**Minimum Qualifications:**
12. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
13. Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
14. Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
15. 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
16. Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
17. Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
18. Experience building, operating, and scaling with Linux or Unix Operating systems
19. Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
20. Experience with Data Center Design and Expansion
**Preferred Qualifications:**
21. Extensive knowledge of storage and AI/ML related services and the hardware that supports them
22. Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
23. Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
24. Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
**Public Compensation:**
$191,000/year to $262,000/year + bonus + equity + benefits
**Industry:** Internet
Critical Facility Engineer
Bowling Green, OH jobs
Meta is seeking a data center Critical Facility Engineer to join our Data Center Facility Operations team. Our data centers serve as the foundation upon which our software operates to meet the demands of our customers. The Critical Facility Engineer will be a part of the Facility Operations team responsible for operating and maintaining critical systems in our data centers.
Minimum Qualifications
* 6+ years of experience in electrical, HVAC, mechanical, controls, or other technical maintenance field
* Associate's degree in engineering plus 4+ years experience or Bachelor's degree in related field plus 2+ years experience in electrical, HVAC, mechanical, or controls will be considered in lieu of 6+ years experience
* Use hands and fingers
* Reach/push/pull with arms/hands/shoulders
* Stoop, kneel, crouch and crawl
* Lift and/or otherwise move 45 pounds or more
* Able to sit or stand at a workstation for extended periods of time
Preferred Qualifications
* Knowledge of mechanical, electrical, and life safety monitoring and control systems typically used in critical environments
* Knowledge of EAM/CMMS systems
* Experience interpreting blueprints/CAD drawings
* Trade Certification or state licensure in Electrical or Mechanical (HVAC)
* 8+ years experience in a data center or other Critical Environment (pharma, clean room, medical, power production, etc.)
* Professional affiliations (7x24 Exchange, IFMA, Data Center Pulse, etc.)
* Knowledge of critical facilities operations
Responsibilities
* Perform hands-on operations and maintenance which includes all physical and administrative operations tasks, service, and maintenance in accordance with site processes and procedures to ensure the highest levels of uptime, efficiency, and safety without disruption to the business
* Complete work order requests accurately and on time in an enterprise asset management (EAM) system (Hexagon)
* Achieve and maintain a high level of technical knowledge regarding data center infrastructure and operations. Successfully complete personnel qualification standards (PQS) training
* Provide technical expertise and assistance as required
* Respond quickly to data center facility emergencies, using Meta procedures or emergency operating procedures
* Regularly inspect equipment, buildings, safety routes and grounds to check or identify any abnormal or unsafe conditions or faults
* Troubleshoot, evaluate and recommend system upgrades
* Order parts and supplies for maintenance and repairs
* Work with vendors and contractors to ensure work is in accordance with the agreed Meta processes, procedures, and standards
* Escalate issues to facility management in an appropriate and timely manner
* Assist in scheduling and supervising vendors/subcontractors during equipment/systems maintenance and service
* Recommend improvements to the operations and maintenance program on an ongoing basis
* Required to work on a shift schedule, which may include nights and holidays
* Utilize computerized tooling, in a control room environment, to operate remote equipment, monitor system alerts, centralize communications, diagnose faults, and prioritize dispatching of resources
Global Infrastructure Engineer
Bowling Green, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest-impact initiatives and to work with others to drive them to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next-gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure, including physical vs. logical layer trade-offs.
**Required Skills:**
Global Infrastructure Engineer Responsibilities:
1. Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
2. Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
3. Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
4. Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
5. Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
6. Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance the physical and logical layers
7. Act as a global communication and advisory point of contact for the design, implementation, and delivery of projects that affect our global data center and server fleet, and facilitate resolution of issues by drawing on local expertise and global support partners
8. Leverage data-driven methodologies to understand a problem at the outset, define a plan, and measure progress throughout a project
9. Provide data-supported narratives and ensure a focus on continuous improvement
10. Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
11. Approximately 20% - 30% travel
**Minimum Qualifications:**
12. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
13. Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
14. Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
15. 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
16. Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
17. Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
18. Experience building, operating, and scaling with Linux or Unix Operating systems
19. Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
20. Experience with Data Center Design and Expansion
**Preferred Qualifications:**
21. Extensive knowledge of storage and AI/ML related services and the hardware that supports them
22. Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
23. Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
24. Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
**Public Compensation:**
$191,000/year to $262,000/year + bonus + equity + benefits
**Industry:** Internet
Global Infrastructure Engineer
Newark, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest impact initiatives, and to work with others to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure including physical vs logical layer trade-offs.
Minimum Qualifications
* Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
* Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
* Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
* 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
* Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
* Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
* Experience building, operating, and scaling with Linux or Unix Operating systems
* Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
* Experience with Data Center Design and Expansion
Preferred Qualifications
* Extensive knowledge of storage and AI/ML related services and the hardware that supports them
* Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
* Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
* Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
Responsibilities
* Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
* Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
* Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
* Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
* Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
* Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance the physical and logical layers
* Act as a global communication and advisory point of contact for the design, implementation and delivery of projects that affect our global data center and server fleet and facilitate resolution of issues drawing on local expertise and global support partners
* Leverage data-driven methodologies to understand a problem at the onset, define a plan, and measure progress throughout a project
* Provide data-supported narratives and ensure a focus on continuous improvement
* Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
* Approximately 20% - 30% travel
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.
Critical Facility Engineer
Bowling Green, OH jobs
Meta is seeking a data center Critical Facility Engineer to join our Data Center Facility Operations team. Our data centers serve as the foundation upon which our software operates to meet the demands of our customers. The Critical Facility Engineer will be a part of the Facility Operations team responsible for operating and maintaining critical systems in our data centers.
**Required Skills:**
Critical Facility Engineer Responsibilities:
1. Perform hands-on operations and maintenance which includes all physical and administrative operations tasks, service, and maintenance in accordance with site processes and procedures to ensure the highest levels of uptime, efficiency, and safety without disruption to the business
2. Complete work order requests accurately and on time in an enterprise asset management (EAM) system (Hexagon)
3. Achieve and maintain a high-level of technical knowledge regarding data center infrastructure and operations. Successfully complete personnel qualification standards (PQS) training
4. Provide technical expertise and assistance as required
5. Respond quickly to data center facility emergencies, using Meta procedures or emergency operating procedures
6. Regularly inspect equipment, buildings, safety routes and grounds to check or identify any abnormal or unsafe conditions or faults
7. Troubleshoot, evaluate and recommend system upgrades
8. Order parts and supplies for maintenance and repairs
9. Work with vendors and contractors to ensure work is in accordance with the agreed Meta processes, procedures, and standards
10. Escalate issues to facility management appropriately and in a timely manner
11. Assist in scheduling and supervising vendors/subcontractors during equipment/systems maintenance and service
12. Recommend improvements to the operations and maintenance program on an ongoing basis
13. Required to work on a shift schedule, which may include nights and holidays
14. Utilize computerized tooling, in a control room environment, to operate remote equipment, monitor system alerts, centralize communications, diagnose faults, and prioritize dispatching of resources
**Minimum Qualifications:**
15. 6+ years of experience in electrical, HVAC, mechanical, controls, or other technical maintenance field
16. Associate's degree in engineering plus 4+ years of experience, or Bachelor's degree in a related field plus 2+ years of experience in electrical, HVAC, mechanical, or controls, will be considered in lieu of 6+ years of experience
17. Use hands and fingers
18. Reach/push/pull with arms/hands/shoulders
19. Stoop, kneel, crouch and crawl
20. Lift and/or otherwise move 45 pounds or more
21. Able to sit or stand at a workstation for extended periods of time
**Preferred Qualifications:**
22. Knowledge of mechanical, electrical, and life safety monitoring and control systems typically used in critical environments
23. Knowledge of EAM/CMMS systems
24. Experience interpreting blueprints/CAD drawings
25. Trade Certification or state licensure in Electrical or Mechanical (HVAC)
26. 8+ years experience in a data center or other Critical Environment (pharma, clean room, medical, power production, etc.)
27. Professional affiliations (7x24 Exchange, IFMA, Data Center Pulse, etc.)
28. Knowledge of critical facilities operations
**Public Compensation:**
$42.31/hour to $58.65/hour + bonus + equity + benefits
**Industry:** Internet
**Equal Opportunity:**
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.
Global Infrastructure Engineer
Bowling Green, OH jobs
The Site Operations team is responsible for the delivery of data center compute and storage at Meta, enabling our family of apps and services to support a growing global community. We are seeking a forward-thinking individual skilled across multiple disciplines to lead global initiatives on this team. The mission of this role is to identify and tackle the biggest technical and operational challenges and opportunities before SiteOps. The Infrastructure Engineer is expected to personally advance our highest impact initiatives, and to work with others to closure through the right working groups and delegates. The scope of the role is Infra-wide; the DC Infra Engineer is expected to work with the data center teams, Core Systems, CEA, PE, and hardware engineering to architect and implement adaptable solutions that transform our infrastructure in dimensions including performance, efficiency, quality, and resiliency. Areas of emphasis include next gen platforms, tools, and technologies; the interplay between our platforms and data centers; and the underlying architecture of our infrastructure including physical vs logical layer trade-offs.
Minimum Qualifications
* Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
* Knowledge of the full stack of infrastructure, with experience building or operating logical infrastructure on top of a complex, distributed physical infrastructure
* Proven communication skills and experience working in a highly distributed environment, across teams/department boundaries
* 10+ years of technical experience, in a large-scale data center or IT Infrastructure environment, or equivalent experience building platforms and systems for large scale compute
* Experience building globally scalable solutions and translating global strategic initiatives into local executable projects
* Knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, network, server and storage systems
* Experience building, operating, and scaling with Linux or Unix Operating systems
* Experience communicating the results of analysis and insights to cross functional teams and influencing the strategy of these teams
* Experience with Data Center Design and Expansion
Preferred Qualifications
* Extensive knowledge of storage and AI/ML related services and the hardware that supports them
* Coding or scripting experience such as Bash, PHP, Python, SQL, or Perl
* Experience in providing technical guidance to external vendors and partners. Knowledge and experience with virtualization, containerization, distributed systems, fault tolerance, and incident management
* Experience with high level data center design, operations, basic electrical/mechanical infrastructure, and scaling physical infrastructure
Responsibilities
* Represent Site Operations in leading work to define and architect new solutions on global initiatives, working with stakeholders across Infra Data Centers & Infrastructure teams
* Assemble and lead teams to address complex engineering challenges, requiring technical expertise as well as a broad understanding of Meta's overall infrastructure
* Address issues that can be ambiguous and global in nature, requiring leadership and collaboration across time zones, teams, and technical domains
* Act as key SME and mentor in the design, operation, and troubleshooting of tools, technologies, and processes utilized within Site Operations
* Understand and assess risks and challenges associated with emerging new hardware, data center and software technologies, and define & implement effective mitigations for these
* Employ a holistic understanding of the full infrastructure stack to lead solutions that appropriately balance the physical and logical layers
* Act as a global communication and advisory point of contact for the design, implementation and delivery of projects that affect our global data center and server fleet and facilitate resolution of issues drawing on local expertise and global support partners
* Leverage data-driven methodologies to understand a problem at the onset, define a plan, and measure progress throughout a project
* Provide data-supported narratives and ensure a focus on continuous improvement
* Build and support trusted, cross-functional connections with teams across the globe and serve as an advocate for the Site Operations Team with key stakeholders, influencing policies and procedures to improve global data center operations
* Approximately 20% - 30% travel
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.
Distinguished Engineer - Spark Infrastructure Architecture
Richfield, OH jobs
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
**Meet the Team:**
Our Distinguished Engineer team drives architecture and technical direction for Splunk's $3.5B platform, ingesting petabytes of data for over 95% of the Fortune 100. As a tight-knit group of DEs reporting to the VP of Architecture, we operate autonomously while coordinating closely on practical, durable solutions that evolve the platform for future requirements. Each DE owns cross-organizational domains that shift fluidly as needs arise. Work alongside exceptionally smart engineers who value rigorous thinking and friendly collaboration. This is where the technical future gets defined!
**Impact:**
Architect and operationalize Spark at scale as a core component of Cisco Data Fabric, enabling customers to access and analyze exponentially more data across diverse organizational sources. Build resilient, always-on infrastructure with automatic retry logic, fault tolerance, and seamless integration across myriad data sources and formats including Parquet, Iceberg, and modern table formats. Design the operational backbone (monitoring, observability, error handling, and horizontal scalability) that enables hundreds of engineers to build features confidently on solid foundations. Establish architectural patterns and support mechanisms that ensure high availability while maintaining performance at petabyte scale. Success is measured by customer adoption of data fabric capabilities and operational reliability metrics (uptime, ingestion performance, and incident reduction) that directly impact customer value.
**Minimum Qualifications:**
+ Bachelor's in Computer Science (or equivalent) with 15+ years of related experience; or Master's with 12+ years; or PhD with 8+ years or equivalent experience
+ Designed and deployed production Spark deployments at scale in cloud environments
+ Experience with modern data formats including Parquet, Iceberg, and related table/columnar formats
+ Production experience with AWS cloud platform
+ Led technical decisions and architectural direction across engineering organizations of 50+ engineers
**Preferred Qualifications:**
+ Production experience with Azure and/or GCP cloud platforms
+ Experience integrating with diverse data platforms such as Snowflake, Pinot, Databricks, Trino, or similar systems
+ Extensive Kubernetes experience in production environments
+ Proven track record mentoring and growing engineers, with strong collaboration skills across sponsors, design, product management, and engineering teams
**Why Cisco?**
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
**Message to applicants applying to work in the U.S. and/or Canada:**
The starting salary range posted for this position is $267,600.00 to $339,400.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
+ 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
+ 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
+ Non-exempt employees** receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
+ Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
+ 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
+ Additional paid time away may be requested to deal with critical or emergency issues for family members
+ Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:
+ 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
+ 1.5% of incentive target for each 1% of attainment between 50% and 75%;
+ 1% of incentive target for each 1% of attainment between 75% and 100%; and
+ Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
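As a rough illustration of how the tiers above combine, the sketch below adds up the payout band by band. It assumes the rates are marginal (each rate applies only to the attainment points inside its own band), which the posting does not state explicitly, and it stops at 100% attainment because the rate above that level is given only as "at or above 1%". The function name `quota_incentive_pct` is hypothetical.

```python
def quota_incentive_pct(attainment_pct: float) -> float:
    """Percent of incentive target earned for a given percent of quota attained.

    Assumption: tier rates are marginal, i.e. each rate applies only to the
    attainment points that fall inside its band. Covers 0% to 100% only.
    """
    if not 0 <= attainment_pct <= 100:
        raise ValueError("this sketch only covers attainment from 0% to 100%")

    # (upper bound of band in % of quota, % of target earned per 1% attained)
    tiers = [(50, 0.75), (75, 1.5), (100, 1.0)]

    earned = 0.0
    lower = 0.0
    for upper, rate in tiers:
        if attainment_pct > lower:
            # Count only the attainment points that fall inside this band.
            earned += (min(attainment_pct, upper) - lower) * rate
        lower = upper
    return earned


if __name__ == "__main__":
    for pct in (50, 60, 75, 100):
        print(f"{pct}% of quota -> {quota_incentive_pct(pct)}% of incentive target")
```

Under this reading, reaching exactly 100% of quota earns 100% of the incentive target (37.5 + 37.5 + 25), which is consistent with the tiers as listed.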
For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.
The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area:
$273,200.00 - $442,600.00
Non-Metro New York state & Washington state:
$267,600.00 - $390,300.00
* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
** Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.
Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.
Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.
Distinguished Engineer - AI Infrastructure Architecture
Richfield, OH jobs
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
**Meet the Team:**
Our Distinguished Engineer team drives architecture and technical direction for Splunk's $3.5B platform, ingesting petabytes of data for over 95% of the Fortune 100. As a tight-knit group of DEs reporting to the VP of Architecture, we operate autonomously while coordinating closely on practical, durable solutions that evolve the platform for future requirements. Each DE owns cross-organizational domains that shift fluidly as needs arise. Work alongside exceptionally smart engineers who value rigorous thinking and friendly collaboration. This is where the technical future gets defined!
**Impact:**
Architect and operationalize AI infrastructure as a core component of Cisco Data Fabric, enabling engineering teams to integrate AI capabilities across the world's largest, most diverse data sets. Design MLOps platforms that solve complex lifecycle challenges (managing embedding model migrations, version compatibility, and continuous model updates) while ensuring operational excellence through monitoring, governance, and serving at petabyte scale. Identify and drive new opportunities for leveraging AI across the stack, establishing consistent architectural patterns for agent prompts, tool integration, and model orchestration. Build data pipelines and operational frameworks that enable hundreds of engineers to confidently deploy both LLMs and traditional ML models into production. Success is measured by customer adoption and usage of AI-powered features, alongside operational metrics (model performance, inference latency, and deployment velocity) that directly impact customer value.
**Minimum Qualifications:**
+ Bachelor's in Computer Science (or equivalent) with 15+ years of related experience; or Master's with 12+ years; or PhD with 8+ years or equivalent experience
+ Designed and deployed AI/ML infrastructure and features in production cloud environments
+ Experience with major AI services including OpenAI, Anthropic, HuggingFace, AWS Bedrock, Azure OpenAI Service, or similar platforms
+ Production experience with AWS, Azure, or GCP cloud platform
+ Led technical decisions and architectural direction across engineering organizations of 50+ engineers
**Preferred Qualifications:**
+ Experience with LLM-specific infrastructure including agent frameworks, prompt management, and tool integration
+ Model serving at scale with experience in inference optimization and performance monitoring
+ Designed and deployed AI/ML infrastructure for on-premises environments
+ Experience with major ML frameworks (TensorFlow, PyTorch, etc.) and model formats
+ Proven track record mentoring and growing engineers, with strong collaboration skills
**Why Cisco?**
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
**Message to applicants applying to work in the U.S. and/or Canada:**
The starting salary range posted for this position is $267,600.00 to $339,400.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
+ 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
+ 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
+ Non-exempt employees** receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
+ Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
+ 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
+ Additional paid time away may be requested to deal with critical or emergency issues for family members
+ Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:
+ 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
+ 1.5% of incentive target for each 1% of attainment between 50% and 75%;
+ 1% of incentive target for each 1% of attainment between 75% and 100%; and
+ Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.
The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area:
$273,200.00 - $442,600.00
Non-Metro New York state & Washington state:
$267,600.00 - $390,300.00
* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
** Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.
Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.
Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.
Distinguished Engineer - Spark Infrastructure Architecture
Richfield, OH jobs
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
Meet the Team:
Our Distinguished Engineer team drives architecture and technical direction for Splunk's $3.5B platform, ingesting petabytes of data for over 95% of the Fortune 100. As a tight-knit group of DEs reporting to the VP of Architecture, we operate autonomously while coordinating closely on practical, durable solutions that evolve the platform for future requirements. Each DE owns cross-organizational domains that shift fluidly as needs arise. Work alongside exceptionally smart engineers who value rigorous thinking and friendly collaboration. This is where the technical future gets defined!
Impact:
Architect and operationalize Spark at scale as a core component of Cisco Data Fabric, enabling customers to access and analyze exponentially more data across diverse organizational sources. Build resilient, always-on infrastructure with automatic retry logic, fault tolerance, and seamless integration across myriad data sources and formats including Parquet, Iceberg, and modern table formats. Design the operational backbone (monitoring, observability, error handling, and horizontal scalability) that enables hundreds of engineers to build features confidently on solid foundations. Establish architectural patterns and support mechanisms that ensure high availability while maintaining performance at petabyte scale. Success is measured by customer adoption of data fabric capabilities and operational reliability metrics (uptime, ingestion performance, and incident reduction) that directly impact customer value.
Minimum Qualifications:
* Bachelor's in Computer Science (or equivalent) with 15+ years of related experience; or Master's with 12+ years; or PhD with 8+ years or equivalent experience
* Designed and deployed Spark in production at scale in cloud environments
* Experience with modern data formats including Parquet, Iceberg, and related table/columnar formats
* Production experience with the AWS cloud platform
* Led technical decisions and architectural direction across engineering organizations of 50+ engineers
Preferred Qualifications:
* Production experience with Azure and/or GCP cloud platforms
* Experience integrating with diverse data platforms such as Snowflake, Pinot, Databricks, Trino, or similar systems
* Extensive Kubernetes experience in production environments
* Proven track record mentoring and growing engineers, with strong collaboration skills across sponsors, design, product management, and engineering teams
Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $267,600.00 to $339,400.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short- and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
* 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
* 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
* Non-exempt employees receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
* Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
* 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
* Additional paid time away may be requested to deal with critical or emergency issues for family members
* Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:
* 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
* 1.5% of incentive target for each 1% of attainment between 50% and 75%;
* 1% of incentive target for each 1% of attainment between 75% and 100%; and
* Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.
The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area:
$273,200.00 - $442,600.00
Non-Metro New York state & Washington state:
$267,600.00 - $390,300.00
* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.
Distinguished Engineer - AI Infrastructure Architecture
Richfield, OH jobs
Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.
Meet the Team:
Our Distinguished Engineer team drives architecture and technical direction for Splunk's $3.5B platform, ingesting petabytes of data for over 95% of the Fortune 100. As a tight-knit group of DEs reporting to the VP of Architecture, we operate autonomously while coordinating closely on practical, durable solutions that evolve the platform for future requirements. Each DE owns cross-organizational domains that shift fluidly as needs arise. Work alongside exceptionally smart engineers who value rigorous thinking and friendly collaboration. This is where the technical future gets defined!
Impact:
Architect and operationalize AI infrastructure as a core component of Cisco Data Fabric, enabling engineering teams to integrate AI capabilities across the world's largest, most diverse data sets. Design MLOps platforms that solve complex lifecycle challenges (managing embedding model migrations, version compatibility, and continuous model updates) while ensuring operational excellence through monitoring, governance, and serving at petabyte scale. Identify and drive new opportunities for leveraging AI across the stack, establishing consistent architectural patterns for agent prompts, tool integration, and model orchestration. Build data pipelines and operational frameworks that enable hundreds of engineers to confidently deploy both LLMs and traditional ML models into production. Success is measured by customer adoption and usage of AI-powered features, alongside operational metrics (model performance, inference latency, and deployment velocity) that directly impact customer value.
Minimum Qualifications:
* Bachelor's in Computer Science (or equivalent) with 15+ years of related experience; or Master's with 12+ years; or PhD with 8+ years or equivalent experience
* Designed and deployed AI/ML infrastructure and features in production cloud environments
* Experience with major AI services including OpenAI, Anthropic, HuggingFace, AWS Bedrock, Azure OpenAI Service, or similar platforms
* Production experience with AWS, Azure, or GCP cloud platforms
* Led technical decisions and architectural direction across engineering organizations of 50+ engineers
Preferred Qualifications:
* Experience with LLM-specific infrastructure including agent frameworks, prompt management, and tool integration
* Model serving at scale with experience in inference optimization and performance monitoring
* Designed and deployed AI/ML infrastructure for on-premises environments
* Experience with major ML frameworks (TensorFlow, PyTorch, etc.) and model formats
* Proven track record mentoring and growing engineers, with strong collaboration skills
Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $267,600.00 to $339,400.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation*, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short- and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
* 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
* 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
* Non-exempt employees receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
* Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
* 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
* Additional paid time away may be requested to deal with critical or emergency issues for family members
* Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan. For quota-based incentive pay, Cisco typically pays as follows:
* 0.75% of incentive target for each 1% of revenue attainment up to 50% of quota;
* 1.5% of incentive target for each 1% of attainment between 50% and 75%;
* 1% of incentive target for each 1% of attainment between 75% and 100%; and
* Once performance exceeds 100% attainment, incentive rates are at or above 1% for each 1% of attainment with no cap on incentive compensation.
For non-quota-based sales performance elements such as strategic sales objectives, Cisco may pay 0% up to 125% of target. Cisco sales plans do not have a minimum threshold of performance for sales incentive compensation to be paid.
The applicable full salary ranges for this position, by specific state, are listed below:
New York City Metro Area:
$273,200.00 - $442,600.00
Non-Metro New York state & Washington state:
$267,600.00 - $390,300.00
* For quota-based sales roles on Cisco's sales plan, the ranges provided in this posting include base pay and sales target incentive compensation combined.
Employees in Illinois, whether exempt or non-exempt, will participate in a unique time off program to meet local requirements.