Site Reliability Engineer (SRE) in United States at Jobgether

Job Function: Engineering
Jobgether
United States

Job Description

Site Reliability Engineer (SRE)

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Site Reliability Engineer (SRE) in the United States.

This opportunity is ideal for an experienced reliability engineer who thrives in highly scalable, distributed environments and is passionate about operational excellence. In this role, you will play a critical part in ensuring the stability, scalability, and performance of modern cloud-native systems while collaborating closely with engineering and infrastructure teams. The position offers the chance to work on long-term, high-impact initiatives focused on automation, observability, resilience, and continuous delivery. You will help shape reliability standards, optimize production systems, and reduce operational overhead through engineering-driven solutions. The environment encourages innovation, technical leadership, and proactive problem-solving, making it a strong fit for professionals who enjoy balancing software engineering with systems operations. This is a fully remote opportunity within the United States, offering long-term career growth and exposure to complex, enterprise-grade platforms.

Accountabilities:
  • Define, implement, and continuously improve service reliability standards through SLOs, SLIs, and error budget management for critical production services.
  • Lead incident response efforts, coordinate production issue resolution, and conduct detailed post-incident reviews to strengthen system resilience and operational maturity.
  • Design and maintain observability frameworks using monitoring, logging, and tracing tools such as Prometheus, Grafana, OpenTelemetry, ELK/EFK, or Datadog.
  • Develop automation tools and operational workflows using Python, Go, Bash, or similar technologies to eliminate repetitive manual tasks and improve system efficiency.
  • Architect, manage, and optimize Kubernetes-based infrastructure, including autoscaling, networking, capacity planning, and container orchestration.
  • Build and improve CI/CD pipelines that support safe deployments, automated testing, canary releases, and progressive rollout strategies.
  • Partner with development teams to embed reliability, fault tolerance, and graceful degradation practices early in the software design lifecycle.
  • Drive initiatives related to chaos engineering, performance testing, security hardening, failover readiness, and platform resiliency improvements.
  • Mentor engineers on SRE best practices while contributing to a collaborative, blameless culture focused on continuous improvement.

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field.
  • 5+ years of professional experience in Site Reliability Engineering, DevOps, production engineering, or infrastructure-focused roles supporting distributed systems.
  • Strong programming and scripting experience with Python, Go, Java, Bash, or similar languages used for automation and tooling development.
  • Deep expertise in Linux systems administration, networking concepts, systems troubleshooting, and performance optimization.
  • Hands-on experience managing Kubernetes clusters and containerized production workloads at scale.
  • Strong understanding of observability practices and modern monitoring ecosystems including Prometheus, Grafana, OpenTelemetry, ELK/EFK, or equivalent platforms.
  • Experience designing and maintaining CI/CD pipelines and deployment automation processes.
  • Solid knowledge of distributed systems concepts, including reliability engineering, failure handling, partitioning, and scalability principles.
  • Proven experience leading incident management processes and conducting actionable post-mortem reviews.
  • Excellent communication, collaboration, and technical documentation skills.
  • Additional exposure to cloud platforms such as AWS, Azure, or GCP, along with service mesh technologies or chaos engineering practices, is highly valued.

Benefits:

  • 100% remote work opportunity across the continental United States.
  • Full-time direct W2 employment with long-term project stability.
  • Competitive base salary aligned with experience and technical expertise.
  • Comprehensive employee benefits package, including healthcare coverage and additional employee perks.
  • Opportunity to work on multi-year engineering initiatives involving modern cloud-native technologies and enterprise-scale infrastructure.
  • Supportive environment focused on technical growth, mentorship, and career advancement.
  • Exposure to cutting-edge reliability, automation, and observability practices in a collaborative engineering culture.
  • H-1B transfer support available for qualified candidates currently holding valid H-1B status.
  • Flexible, remote-first work environment designed to support productivity and work-life balance.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

