Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of an Infrastructure Engineer (GPU & Compute) at Jobgether?

The Infrastructure Engineer (GPU & Compute) position at Jobgether is a full-time or part-time opportunity in the Engineering field.

Where is this Infrastructure Engineer (GPU & Compute) job located?

United States, Other / Non-US

What type of employment is offered for this Infrastructure Engineer (GPU & Compute) role?

Full-time or part-time position

What is the expected salary for this Infrastructure Engineer (GPU & Compute) job?

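The listing gives a base salary range of $180,000 to $200,000 USD, based on experience and location, plus a performance-based bonus and equity. Final compensation will be discussed during the hiring process.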

Infrastructure Engineer (GPU & Compute)

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an Infrastructure Engineer (GPU & Compute) in the United States.

This role is at the core of building and scaling high-performance infrastructure designed for modern AI and machine learning workloads. You will work across hardware, systems, and software layers to ensure GPU-enabled environments are reliable, efficient, and production-ready from day one. The position combines deep technical expertise with hands-on ownership of image pipelines, system validation, and large-scale compute environments. You will play a critical role in enabling seamless deployment and operation of cutting-edge AI infrastructure by improving automation, diagnostics, and performance. Collaborating with cross-functional teams, you will help bring new systems online, validate next-generation hardware, and enhance operational efficiency. This is a high-impact opportunity within a fast-paced, innovation-driven environment focused on scaling compute for the future of AI.

Accountabilities:

Own and evolve systems for image management, deployment, and validation across large-scale bare-metal and GPU-enabled infrastructure environments.
Maintain and operate validation clusters used for system diagnostics, testing, and infrastructure bring-up to ensure readiness and reliability.
Lead GPU diagnostics and validation workflows, identifying performance bottlenecks, failure patterns, and system-level issues across hardware and software layers.
Build and enhance automation tools and workflows (primarily in Python) to streamline provisioning, validation, and operational processes.
Support hardware qualification efforts for new platforms, including firmware, drivers, and operating system validation.
Manage Linux-based production and validation environments, including virtualization and bare-metal provisioning systems (e.g., PXE workflows).
Collaborate with infrastructure, hardware, data center, and ML teams to align systems with workload requirements and ensure optimal performance.
Contribute to best practices for infrastructure lifecycle management, system diagnostics, and scalability improvements.

Requirements:

5+ years of experience in infrastructure engineering, systems engineering, or related technical roles.
Strong expertise in Linux systems administration within production or large-scale environments.
Hands-on experience with GPU-enabled systems and performance/monitoring tools such as NVIDIA DCGM.
Solid understanding of bare-metal provisioning, system bring-up processes, and image-based deployment workflows.
Proficiency in Python or similar programming/scripting languages for building automation tools.
Demonstrated ability to troubleshoot complex issues across hardware, operating systems, GPUs, and system software layers.
Familiarity with hardware management interfaces such as IPMI, iDRAC, or Redfish.
Experience working with data center infrastructure and physical hardware environments is highly valued.
Bonus: Experience with high-performance interconnects (InfiniBand, NVLink), AI/ML or HPC workloads, and large-scale hardware validation frameworks.

Benefits:

Competitive base salary ranging from $180,000 to $200,000 USD, based on experience and location.
Performance-based bonus and meaningful equity participation.
Comprehensive medical, dental, and vision coverage.
Retirement and financial wellness programs.
Generous paid time off, holidays, and paid parental leave.
Flexible remote or hybrid work options within the United States.
Professional development support and learning opportunities.
Wellness and home office stipends.
Inclusive and collaborative work environment focused on innovation and balance.

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
