Senior Software Platform Engineer at Jobgether – United States
About This Position
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Software Platform Engineer in the United States.
This role offers the chance to shape and scale a cutting-edge software platform supporting high-performance quantum simulations. You will work at the intersection of cloud infrastructure, GPU clusters, and developer tooling to enable researchers to run complex quantum algorithms efficiently. The position combines hands-on platform engineering with HPC expertise, giving you the ability to influence the evolution of infrastructure and workflows. You’ll collaborate closely with platform engineers and researchers, ensuring systems are reliable, performant, and secure. The environment is fast-paced and innovative, with high visibility and the freedom to make technical decisions. Ideal candidates enjoy solving complex engineering challenges, building scalable systems, and enabling teams to deliver breakthrough research.
As a Senior Software Platform Engineer, you will:
- Own and maintain AWS infrastructure end-to-end, including ECS/EKS clusters, VPCs, security groups, and multi-account setups.
- Improve CI/CD pipelines, streamline deployments, and enhance monitoring, alerting, and incident response procedures.
- Harden systems by securing IAM roles, container images, and authentication flows while balancing usability for researchers.
- Bridge GPU/HPC infrastructure with researcher workflows, ensuring CUDA compatibility, SLURM job scheduling, and reproducible containerized Python simulations.
- Monitor cluster utilization, cost, and runtime efficiency, and optimize workloads to maximize performance and reliability.
- Partner with platform engineers and research teams to reduce operational friction and implement scalable, repeatable processes.
Candidates should have:
- 5+ years of experience in Platform Engineering, DevOps, or SRE roles with production AWS environments.
- Strong hands-on knowledge of Infrastructure as Code using Terraform, Pulumi, or CDK.
- CI/CD experience with tools such as GitLab CI, GitHub Actions, or equivalent, with a track record of improving deployment reliability and speed.
- Experience supporting GPU workloads, including CUDA, driver/version management, and HPC scheduling (e.g., SLURM).
- Familiarity with monitoring, alerting, and incident response best practices in cloud-native platforms.
- Strong troubleshooting, collaboration, and communication skills, and comfort making judgment calls under uncertainty.
- Preferred: experience in scientific computing, research infrastructure, ML platforms, or early-stage startups; exposure to quantum computing SDKs or hybrid classical-quantum workflows; security/compliance experience (FedRAMP a plus).
Benefits:
- Competitive US-based salary range.
- Fully remote work with flexibility across the United States.
- Equity ownership opportunities.
- Healthcare contributions and benefits.
- Professional growth in a high-impact, innovative environment.
- Access to cutting-edge research and HPC infrastructure.