Senior Site Reliability Engineer - AWS in United States at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer - AWS in the United States.
This role sits at the core of ensuring reliability, scalability, and performance for large-scale, cloud-native systems powering modern legal technology platforms. You will work closely with cross-functional engineering teams to design and maintain highly automated infrastructure that supports continuous deployment, observability, and operational excellence. The position emphasizes building resilient systems in AWS while reducing operational toil through advanced automation and engineering best practices. You will act as a key guardian of system reliability, ensuring production environments remain secure, stable, and highly available. The environment is fast-paced and innovation-driven, requiring both deep technical expertise and a proactive, ownership-oriented mindset. This is a high-impact engineering role where your work directly enables seamless, intelligent legal workflows at scale.
Responsibilities:
- Design, build, and maintain highly automated and autonomous systems for deployment, testing, monitoring, and operation of production environments.
- Lead reliability engineering efforts across the SDLC, ensuring system stability, performance, and scalability standards are consistently met.
- Develop and enhance CI/CD pipelines, automation scripts, and operational tooling to reduce manual effort and improve delivery speed.
- Implement robust monitoring, alerting, and observability systems to ensure real-time visibility into infrastructure and application health.
- Identify and resolve issues related to system availability, performance bottlenecks, and security vulnerabilities.
- Collaborate with engineering teams to improve architecture, reliability practices, and incident response processes.
- Participate in on-call rotations and provide rapid response support for production incidents.
- Document system architecture, operational procedures, and best practices while mentoring junior engineers.
Requirements:
- 8+ years of experience in software engineering, infrastructure, or operations, including at least 4 years in Site Reliability Engineering roles.
- Strong hands-on expertise with AWS services such as EC2, EKS, Lambda, S3, IAM, and CloudWatch.
- Proficiency in scripting and programming languages such as Python, Bash, or PowerShell.
- Proven experience building and maintaining highly automated, large-scale production systems.
- Strong knowledge of CI/CD pipelines, monitoring/alerting systems, incident response, and capacity planning.
- Experience improving system reliability through automation and reducing operational toil in production environments.
- Strong understanding of security best practices in cloud infrastructure.
- Ability to work independently in fast-paced environments while driving continuous improvement initiatives.
- Strong communication skills with the ability to collaborate across technical and non-technical stakeholders.
- Bachelor’s degree in Computer Science or related field, or equivalent hands-on experience and certifications.
Benefits:
- Competitive base salary range of $175,000–$190,000 depending on experience and location.
- Comprehensive medical, dental, and vision insurance for full-time employees.
- Paid time off, maternity and paternity leave, and short- and long-term disability coverage.
- Opportunity to work in a fast-growing, innovation-focused technology environment.
- Exposure to large-scale cloud infrastructure and modern DevOps practices.
- Strong culture of learning, mentorship, and technical growth.
- Additional perks, including a competitive compensation structure and company-wide benefits programs.