
Site Reliability Engineer in United States at Jobgether

Job Function: Engineering
Jobgether
United States, United States

Job Description

Site Reliability Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Site Reliability Engineer based in the United States.

This role focuses on building and maintaining highly available, secure, and performant cloud platforms that power critical customer-facing services. You will be responsible for ensuring system reliability through automation, observability, and strong engineering practices across distributed cloud environments.
You will work closely with product and engineering teams to design scalable infrastructure and improve system resilience in a fast-paced cloud services environment.
The role combines hands-on cloud engineering with operational excellence, including incident response, monitoring, and continuous improvement of production systems.
You will play a key part in shaping reliability standards, deployment practices, and infrastructure automation across the organization.
This is a highly collaborative role requiring strong technical depth, problem-solving skills, and a proactive mindset toward system health and performance.
You will also contribute to mentoring peers and improving engineering practices within a small, agile, and technically driven team.
Success in this role means consistently increasing system stability, reducing operational risk, and improving deployment efficiency at scale.

Accountabilities:
  • Design, implement, and maintain observability solutions to ensure high availability, performance, and reliability across cloud-based systems
  • Participate in on-call rotations, incident response, and postmortem analysis to drive continuous operational improvements
  • Collaborate with product and engineering teams to design and deploy scalable, resilient, and secure infrastructure solutions
  • Develop and enforce cloud architecture standards, reliability practices, and automation strategies for large-scale systems
  • Build and maintain infrastructure automation using Infrastructure-as-Code tools such as Terraform, ARM, Bicep, or CloudFormation
  • Implement CI/CD and deployment automation workflows using modern DevOps toolchains and source control systems
  • Integrate and automate monitoring and operational tools such as Dynatrace, Datadog, App Insights, and similar observability platforms
  • Develop scripting and automation solutions using Python, PowerShell, Bash, or REST APIs to improve operational efficiency
  • Maintain technical documentation, operational runbooks, and knowledge base content to support engineering and support teams
  • Collaborate on security and compliance requirements including SOC, FedRAMP, and cloud security best practices

Requirements:

  • 6+ years of experience in Site Reliability Engineering, cloud infrastructure, or software engineering roles
  • Strong hands-on experience with Kubernetes-based environments such as AKS, EKS, GKE, or OpenShift
  • Deep knowledge of cloud platforms including Microsoft Azure, AWS, or Google Cloud Platform
  • Proven experience implementing Infrastructure-as-Code using tools such as Terraform, ARM templates, Bicep, or CloudFormation
  • Strong expertise in observability and monitoring tools such as Dynatrace, Datadog, New Relic, Prometheus, Grafana, or Log Analytics
  • Solid scripting and automation skills using Python, PowerShell, Bash, or similar languages
  • Strong understanding of CI/CD pipelines, Git-based workflows, and DevOps practices
  • Experience with configuration management tools such as Ansible, Chef, Puppet, or similar
  • Familiarity with distributed systems, containerized applications, and cloud-native architectures
  • Ability to work independently in ambiguous environments while managing multiple priorities effectively
  • Strong communication skills with the ability to collaborate across engineering, product, and operations teams
  • Experience working in Agile environments using Jira or Azure DevOps Boards
  • Knowledge of compliance frameworks such as SOC or FedRAMP is a strong advantage

Benefits:

  • Competitive base salary ranging from USD 114,000 to 148,000 depending on experience and location
  • Comprehensive health coverage including medical, dental, vision, and life insurance
  • Retirement savings plan (401(k)) with employer support
  • Short-term and long-term disability coverage
  • Paid vacation time and paid holidays
  • Professional development and training opportunities
  • Remote work flexibility within the United States
  • Exposure to large-scale cloud environments and modern DevOps practices
  • Opportunity to work on high-impact production systems with strong engineering ownership

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

