Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of a Cloud Reliability & Recovery Engineer at Jobgether?

The Cloud Reliability & Recovery Engineer position at Jobgether is a full-time or part-time opportunity in the Information Technology field.

What type of employment is offered for this Cloud Reliability & Recovery Engineer role?

This role is offered as a full-time or part-time position.

What is the expected salary for this Cloud Reliability & Recovery Engineer job?

Compensation will be discussed during the hiring process.

Cloud Reliability & Recovery Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Cloud Reliability & Recovery Engineer based in India.

This is a senior, hands-on cloud engineering role focused on building and maintaining highly resilient, always-available AWS environments. You will design and operate large-scale disaster recovery (DR) and business continuity (BCP) frameworks that ensure critical systems remain operational even during major disruptions. The role sits at the intersection of SRE, infrastructure engineering, and incident response, with a strong emphasis on automation, fault tolerance, and cloud-native architecture. You will work extensively with Kubernetes, Terraform, and AWS-native resilience services to engineer multi-region failover and recovery strategies. The environment is fast-paced, security-conscious, and highly collaborative, involving close partnership with infrastructure, security, and application teams. Your work will directly reduce downtime risk and strengthen global service reliability across mission-critical systems.

Accountabilities:

Design and implement highly available, multi-region and multi-AZ AWS architectures aligned with defined RTO/RPO objectives, ensuring system continuity under failure scenarios.
Build and maintain disaster recovery (DR) solutions including automated failover/failback mechanisms using services such as Route 53, Global Accelerator, CloudFront, and AWS Systems Manager.
Develop and execute backup, restore, and data replication strategies across AWS services (RDS, DynamoDB, S3, EFS, Aurora), ensuring integrity and recoverability.
Implement infrastructure as code using Terraform or CloudFormation to standardize and automate DR-ready environments.
Create and maintain CI/CD-driven DR testing pipelines, including chaos engineering practices to validate system resilience under real-world failure conditions.
Monitor system availability and resilience using CloudWatch, incident tooling, and AWS health services, participating in on-call rotations and leading incident response efforts.
Conduct DR drills, tabletop exercises, and post-incident reviews to continuously improve recovery readiness and compliance posture.

Requirements:

5+ years of experience in cloud engineering, SRE, infrastructure, or disaster recovery roles, including at least 3 years in AWS production environments at scale.
Proven experience designing and operating multi-region disaster recovery architectures with measurable RTO/RPO outcomes.
Strong expertise in AWS services related to resilience, including networking (VPC, DNS, VPN, Direct Connect) and storage/database replication.
Hands-on experience with Infrastructure as Code tools such as Terraform and/or CloudFormation.
Proficiency in scripting and automation using Python, Bash, or PowerShell.
Solid understanding of Kubernetes-based deployments, including scaling, self-healing, and multi-cluster strategies.
Experience with CI/CD tools and practices (e.g., GitHub Actions, CodePipeline, CodeBuild).
Strong communication skills with the ability to document DR strategies and present technical risks and recovery plans clearly.
Preferred: AWS certifications (Solutions Architect – Professional, DevOps Engineer – Professional, Advanced Networking – Specialty).

Benefits:

Competitive compensation package aligned with senior-level cloud engineering roles.
Opportunity to work on large-scale, mission-critical cloud infrastructure with global impact.
Flexible and remote-friendly work arrangements (depending on team policy).
Strong focus on learning and upskilling in advanced AWS, resilience engineering, and cloud architecture.
Exposure to modern engineering practices including chaos engineering, SRE methodologies, and GitOps workflows.
Collaborative, high-autonomy environment with strong engineering ownership.
Health, wellness, and standard employee benefits in line with industry benchmarks.

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
