
Software Engineer, Compute Infrastructure in United States at Jobgether

Job Function: Information Technology
Jobgether
United States


Job Description

Software Engineer, Compute Infrastructure

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Software Engineer, Compute Infrastructure in the United States.

This role sits at the core of building and scaling the infrastructure that powers large-scale AI systems, transforming massive compute resources into a reliable, efficient, and high-performance platform. You will work across the full infrastructure stack, from hardware and networking to orchestration, storage, and developer tooling, enabling researchers and product teams to run complex workloads with speed and reliability. The environment is highly technical and deeply collaborative, where small improvements in systems performance, scheduling, or observability can have significant downstream impact. You will contribute to designing and operating distributed systems that span accelerators, CPUs, networks, and data centers. This role offers exposure to cutting-edge compute environments and the opportunity to directly influence the efficiency and scalability of frontier AI workloads. It is ideal for engineers who enjoy working across systems layers and solving deeply complex infrastructure challenges.

Accountabilities:
  • Design, build, and optimize large-scale compute infrastructure systems supporting high-performance AI workloads across distributed environments.
  • Develop and operate infrastructure spanning compute, networking, storage, orchestration, and cluster scheduling systems.
  • Improve performance and reliability through profiling, benchmarking, and optimization of workloads across compute, memory, and network layers.
  • Build automation and tooling for provisioning, monitoring, incident response, and lifecycle management of compute resources.
  • Contribute to the design of developer platforms, observability tools, CaaS systems, and agent infrastructure to improve usability and efficiency.
  • Collaborate with research, hardware, networking, and operations teams to ensure efficient and scalable compute capacity.
  • Identify system bottlenecks and translate operational insights into durable infrastructure improvements and abstractions.
  • Help evolve the platform architecture to better support heterogeneous and large-scale compute environments.
Requirements:
  • Strong software engineering background with experience in production-grade infrastructure systems.
  • Experience in one or more areas such as distributed systems, high-performance computing, networking, storage systems, Kubernetes, observability, or infrastructure tooling.
  • Solid understanding of system-level performance optimization, debugging, and large-scale system behavior.
  • Familiarity with GPU infrastructure, RDMA, NCCL, or other high-performance communication frameworks is a plus.
  • Ability to work across hardware, software, and networking layers to diagnose and resolve complex issues.
  • Strong ownership mindset with the ability to operate effectively in ambiguous and fast-changing environments.
  • Excellent collaboration and communication skills across multidisciplinary engineering teams.
  • Motivation to build scalable infrastructure that enables advanced AI research and production systems.
Benefits:
  • Competitive compensation aligned with experience and market standards.
  • Comprehensive health, dental, and vision insurance coverage.
  • Flexible work arrangements supporting collaboration across distributed teams.
  • Opportunity to work on cutting-edge AI infrastructure at massive scale.
  • High-impact role with direct contribution to frontier AI research and systems.
  • Professional growth in a highly technical and research-driven engineering environment.
  • Inclusive and mission-driven workplace culture focused on safety, collaboration, and innovation.
How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

Job Location

United States
