Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of a Senior Site Reliability Engineer at Jobgether?

The Senior Site Reliability Engineer position at Jobgether is a full-time or part-time opportunity in the Information Technology field.

Where is this Senior Site Reliability Engineer job located?

United States, Other / Non-US

What type of employment is offered for this Senior Site Reliability Engineer role?

Full-time or part-time position

What is the expected salary for this Senior Site Reliability Engineer job?

Compensation will be discussed during the hiring process.

Senior Site Reliability Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Site Reliability Engineer based in the United States.

This role sits at the core of a high-scale cloud infrastructure environment powering a leading AI-driven video platform used by global enterprise customers. You will take ownership of operational excellence across critical systems running on AWS, Kubernetes, and supporting services such as MongoDB and workflow orchestration tools. The position blends deep production reliability work with meaningful engineering ownership, focusing on eliminating operational fragility and reducing reliance on individual knowledge. You will be responsible for transforming manual, high-risk processes into automated, resilient systems that scale with the business. Working closely with engineering, infrastructure, and external vendors, you will help define how reliability is achieved at scale. This is a high-impact role for someone who thrives in ownership-heavy environments and enjoys solving complex operational challenges. The environment is fast-moving, highly technical, and deeply collaborative.

Accountabilities:

You will be responsible for ensuring the reliability, scalability, and operational excellence of core cloud infrastructure systems. This includes owning incident response processes, improving monitoring and detection, and driving long-term reductions in system failures and customer-impacting events.

Lead incident management activities, including on-call coordination, postmortems, and continuous improvement of response workflows
Design and implement automation to eliminate high-risk, low-frequency operational tasks and reduce system fragility
Take ownership of key infrastructure domains such as Kubernetes operations, observability systems, or workflow orchestration platforms
Manage vendor relationships and external integrations, ensuring reliability, accountability, and reduced operational dependency
Drive FinOps initiatives by improving cost visibility, optimizing cloud usage, and aligning infrastructure spend with business needs
Collaborate with engineering teams to define reliability standards, operational best practices, and scalable system design patterns
Build documentation and operational frameworks that eliminate single points of failure across critical systems

Requirements:

The ideal candidate brings strong hands-on experience in production infrastructure environments, with a focus on reliability engineering, automation, and cloud-native systems. You are comfortable operating in high-scale AWS and Kubernetes environments and have a pragmatic approach to solving operational challenges.

5+ years of experience in Site Reliability Engineering, DevOps, or infrastructure-focused engineering roles in production environments
Strong experience with AWS and Kubernetes in large-scale systems, with additional familiarity with MongoDB and distributed systems
Proficiency in Python or similar scripting languages for automation and operational tooling
Deep understanding of incident management, root cause analysis, and production reliability practices
Strong judgment under pressure, with the ability to remain calm and effective during critical incidents
Experience working cross-functionally across engineering, infrastructure, and external vendor teams
Strong communication skills with the ability to influence through data, clarity, and collaboration rather than escalation
Bonus: exposure to FinOps, observability platforms, Temporal, or vendor management in infrastructure environments

Benefits:

Competitive base salary with performance-based compensation components
Equity participation in a high-growth technology company
Comprehensive medical, dental, and vision coverage for employees and eligible dependents
Flexible and remote-first working environment
Paid time off, parental leave, and company holidays
Learning and development budget to support continuous skill growth
Modern cloud infrastructure environment with opportunities to work on large-scale distributed systems
Exposure to cutting-edge AI infrastructure and enterprise-grade production systems

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
