Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of a Cloud Reliability & Recovery Engineer at Jobgether?

The Cloud Reliability & Recovery Engineer position at Jobgether is a full-time or part-time opportunity in the Information Technology field.

What type of employment is offered for this Cloud Reliability & Recovery Engineer role?

This role is offered as a full-time or part-time position.

What is the expected salary for this Cloud Reliability & Recovery Engineer job?

Compensation will be discussed during the hiring process.

Cloud Reliability & Recovery Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Cloud Reliability & Recovery Engineer based in India.

This role sits at the core of large-scale cloud resilience engineering, focused on ensuring critical systems remain highly available, fault-tolerant, and recoverable under any disruption. You will design and operate advanced AWS-based disaster recovery and business continuity architectures across multi-region environments. The position requires deep hands-on engineering expertise in cloud infrastructure, automation, and reliability practices, with a strong emphasis on Kubernetes, Infrastructure as Code, and CI/CD-driven operations. You will work closely with security, infrastructure, and application teams to define and enforce recovery strategies aligned with strict RTO/RPO objectives. This is a highly technical role where you will build automated DR systems, validate resiliency through chaos engineering, and continuously improve platform stability. The environment is fast-paced, engineering-driven, and focused on measurable reliability outcomes at enterprise scale.

Accountabilities:

You will design, implement, and maintain highly resilient cloud architectures, with a strong focus on disaster recovery, business continuity, and system availability. Responsibilities include:

Designing multi-region and multi-AZ AWS architectures aligned with defined RTO/RPO targets
Building and maintaining failover and failback mechanisms using Route 53, Global Accelerator, and CloudFront
Developing automated disaster recovery runbooks using AWS Systems Manager, Step Functions, and related services
Implementing backup and recovery strategies across AWS services including EC2, RDS, S3, DynamoDB, and Aurora
Automating backup policies, replication workflows, and recovery validation processes
Performing chaos engineering and resilience testing using AWS Fault Injection Service (FIS, formerly Fault Injection Simulator)
Managing Infrastructure as Code using Terraform and/or CloudFormation for DR environments
Developing CI/CD-driven automation for failover, deployment, and recovery workflows
Building observability dashboards, alerts, and incident response workflows using CloudWatch and third-party tools
Participating in on-call rotations, incident response, and post-incident reviews
Maintaining DR documentation, compliance artifacts, and audit-ready recovery evidence

Requirements:

The ideal candidate brings strong AWS expertise, deep cloud reliability experience, and a proven ability to design and operate large-scale disaster recovery systems.

5+ years of experience in cloud infrastructure, SRE, or disaster recovery engineering roles
3+ years of hands-on AWS production experience at scale
Proven experience designing and implementing multi-region DR architectures with defined RTO/RPO
Strong expertise in AWS services including EC2, RDS, S3, DynamoDB, Aurora, and related resilience tools
Hands-on experience with Kubernetes-based deployments and cloud-native architecture
Strong scripting skills in Python, Bash, or PowerShell for automation and orchestration
Experience with Infrastructure as Code tools such as Terraform or AWS CloudFormation
Solid understanding of networking concepts including VPC, DNS failover, VPN, and Direct Connect
Strong knowledge of CI/CD pipelines and automation frameworks
Excellent communication skills with the ability to produce clear technical and executive reports
Experience with resilience frameworks, compliance standards, and operational best practices

Benefits:

Competitive compensation aligned with experience and industry standards
Opportunity to work on mission-critical, large-scale cloud resilience systems
Remote-friendly work environment with global collaboration
Exposure to advanced AWS architectures, DR automation, and chaos engineering practices
Strong focus on engineering excellence, automation, and continuous improvement
Learning opportunities in cloud reliability, security, and enterprise-scale infrastructure
Collaborative environment working with highly skilled engineering and security teams

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
