What is the role of a Sr. Site Reliability Engineer at Jobgether?

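Jobgether has posted this Sr. Site Reliability Engineer position on behalf of a partner company. The role focuses on building and maintaining reliable, scalable, and secure infrastructure for insurance technology platforms across cloud and on-prem environments.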

Where is this Sr. Site Reliability Engineer job located?

United States, Other / Non-US

What type of employment is offered for this Sr. Site Reliability Engineer role?

Full-time or part-time position

What industry does this Sr. Site Reliability Engineer position belong to?

The role supports insurance technology platforms.

What is the expected salary for this Sr. Site Reliability Engineer job?

The listing states a salary range of $65,000 to $160,000, depending on experience and qualifications.

How can I apply for the Sr. Site Reliability Engineer position at Jobgether?

You can apply directly through the application link provided.

Sr. Site Reliability Engineer at Jobgether

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Sr. Site Reliability Engineer in the United States.

This role sits at the core of building and maintaining highly reliable, scalable, and secure infrastructure that powers mission-critical insurance technology platforms. You will work across cloud and on-prem environments, ensuring systems are resilient, observable, and optimized for performance at scale. Operating in a collaborative, engineering-driven environment, you will partner closely with development, platform, and product teams to design and evolve robust distributed systems. The position blends hands-on infrastructure engineering with automation, reliability strategy, and operational excellence. You will play a key role in improving deployment workflows, strengthening system resilience, and enabling continuous delivery practices. With a strong focus on observability and incident response, you will help ensure services remain highly available and performant. This is an opportunity to directly impact platform stability while contributing to modern DevOps and SRE practices in a fast-evolving technical ecosystem.

Accountabilities

In this role, you will ensure the reliability, scalability, and performance of complex systems across cloud and hybrid environments while driving automation and operational maturity. You will design, build, and maintain infrastructure and tooling that supports high availability and efficient software delivery.

Develop and maintain Infrastructure as Code (IaC) using tools such as Terraform, Terraform CDK, Packer, and Ansible to automate provisioning and configuration across environments
Collaborate with engineering teams to design scalable, fault-tolerant, and high-performance distributed systems
Implement and manage observability and monitoring solutions (e.g., Datadog), ensuring adherence to SLIs, SLOs, and SLAs
Build and optimize CI/CD pipelines using GitHub Actions, GitLab, and related tools to support reliable and efficient deployments
Manage Kubernetes environments, including Helm and ArgoCD, for container orchestration and application delivery
Drive automation of operational processes using scripting languages such as Python, Go, Bash, and PowerShell
Support incident response, troubleshooting, and on-call rotations to ensure system stability and rapid issue resolution
Design disaster recovery and high-availability strategies across cloud and hybrid infrastructure
Collaborate with vendors and cross-functional teams to integrate external tools and services into the platform ecosystem
Document systems, workflows, and operational standards to ensure knowledge sharing and consistency across teams

Requirements

You bring strong experience in site reliability engineering, DevOps, or infrastructure-focused roles, with a deep understanding of distributed systems and production-scale environments. You are comfortable working across cloud platforms, automation tooling, and containerized architectures, and you thrive in fast-paced, collaborative engineering environments.

5+ years of experience in DevOps, SRE, or Infrastructure Engineering roles
Strong expertise in incident management, troubleshooting, and production system reliability
Hands-on experience with cloud platforms such as AWS, GCP, or Azure
Strong proficiency in Infrastructure as Code tools, especially Terraform and related frameworks
Experience with Kubernetes, including Helm charts and ArgoCD for deployment orchestration
Proficiency in scripting and programming languages such as Python, Go, Bash, and PowerShell
Familiarity with CI/CD pipelines and version control systems like GitHub and GitLab
Experience with observability tools such as Datadog and logging/monitoring best practices
Knowledge of both Linux and Windows system administration
Strong communication skills with the ability to collaborate across technical and non-technical teams
Ability to prioritize effectively, troubleshoot under pressure, and drive operational improvements
Experience mentoring engineers and contributing to team knowledge sharing is a plus

Benefits

Competitive salary ranging from $65,000 to $160,000 depending on experience and qualifications
Bonus and additional compensation eligibility based on role
Comprehensive medical, dental, and vision insurance coverage
Paid vacation, holidays, health & wellness days, and a birthday bonus day
Flexible remote work arrangement across North America
401(k) retirement savings plan and other financial benefits
Strong learning and career development culture with mentorship opportunities
Collaborative, inclusive engineering environment focused on innovation and reliability
Work-life balance supported through flexible scheduling and remote-first practices

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
