Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What type of employment is offered for this Datacenter Hardware Operations Technician Lead, Industrial Compute role?

This role is offered as a full-time or part-time position.

What is the expected salary for this Datacenter Hardware Operations Technician Lead, Industrial Compute job?

Compensation will be discussed during the hiring process.

Datacenter Hardware Operations Technician Lead, Industrial Compute

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Datacenter Hardware Operations Technician Lead, Industrial Compute based in the United States.

This role sits at the core of large-scale AI infrastructure reliability, where hands-on datacenter expertise directly supports the performance of advanced compute environments powering frontier AI systems. You will act as the senior on-site technical authority for hardware operations, ensuring the stability, availability, and lifecycle health of GPU, server, and storage systems. The position combines deep technical troubleshooting with operational leadership across high-density industrial compute campuses. You will partner closely with engineering, operations, and external vendors to resolve complex hardware issues and drive long-term reliability improvements. The environment is fast-scaling, mission-critical, and deeply collaborative, requiring both precision execution and systems-level thinking. This is a highly impactful role shaping the operational backbone of next-generation AI infrastructure.

Accountabilities:

In this role, you will lead on-site hardware operations and ensure the reliability and performance of large-scale compute infrastructure supporting mission-critical workloads.

Serve as the senior on-site technical lead for server, GPU, storage, and rack-level hardware operations
Drive diagnosis, triage, and resolution of complex hardware failures impacting production systems
Lead root cause analysis (RCA) efforts and implement corrective and preventive actions to improve fleet reliability
Partner with engineering, OEM vendors, and operations teams to manage repairs, replacements, and lifecycle activities
Develop, refine, and standardize hardware maintenance procedures, troubleshooting runbooks, and operational best practices
Analyze hardware failure trends and operational telemetry to identify risks and reliability improvement opportunities
Support hardware onboarding, validation, and production readiness for new infrastructure deployments
Mentor technicians and partner teams on advanced troubleshooting and hardware reliability practices

Requirements:

This role requires extensive experience in large-scale datacenter environments, with strong technical depth in hardware systems and proven leadership in operational troubleshooting.

8+ years of experience in datacenter hardware operations, sustaining engineering, or senior technician roles
Strong expertise in server, GPU, storage, and rack-level infrastructure in large-scale environments
Proven ability to diagnose complex hardware failures and lead high-priority production incident resolution
Experience conducting root cause analysis and driving long-term reliability improvements
Solid understanding of hardware reliability engineering, fleet health, and operational monitoring systems
Ability to collaborate across engineering, operations, and vendor ecosystems in high-pressure environments
Strong communication skills with experience documenting processes and influencing technical decisions
Familiarity with Linux systems, hardware validation workflows, and datacenter tooling is a plus

Benefits:

Competitive base compensation with equity and performance-based bonus eligibility
Comprehensive medical, dental, and vision coverage with employer contributions
401(k) retirement plan with employer match
Generous paid time off, holidays, and company-wide recharge breaks
Paid parental leave, medical leave, and caregiver support programs
Annual learning and development stipend for professional growth
Wellness and mental health support resources
Relocation support for eligible employees and additional lifestyle benefits

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

