Senior Engineer – AWS in Gurugram, Uttar Pradesh at AHEAD
Job Description
Incident & Problem Management
Lead triage, diagnosis, and resolution of critical (P1/P2) incidents across the AWS ecosystem.
Perform root cause analysis (RCA) and deliver customer-facing post-mortem reports with actionable prevention plans.
Coordinate with Technical Account Managers (TAMs), Service Teams, and customer stakeholders during live outages, demos, and meetings.
Advanced Troubleshooting (Broad AWS Coverage)
Compute: EC2 (AMI baking, Spot, Graviton), Lambda (concurrency, VPC), ECS/EKS (Fargate, Karpenter), Batch, Outposts.
Storage: S3 (lifecycle, replication, Event Notifications), EBS (io2 Block Express, snapshots), EFS, FSx (ONTAP, Lustre), Storage Gateway, Backup.
Database: RDS (Aurora, Multi-AZ, read replicas), DynamoDB (GSI, DAX, Streams), DocumentDB, Neptune, ElastiCache (Redis/Memcached).
Networking & Content Delivery: VPC (peering, TGW, Network Firewall), Direct Connect, Site-to-Site VPN, Route 53 (health checks, latency routing), CloudFront, Global Accelerator, API Gateway.
Security, Identity & Compliance: IAM (policies, SCPs, permissions boundaries), KMS, Secrets Manager, Security Hub, GuardDuty, Macie, Certificate Manager, WAF, Shield.
Management & Governance: AWS Organizations, Control Tower, CloudTrail, Config, Trusted Advisor, Service Quotas, License Manager.
Analytics: Athena, Redshift, EMR, Kinesis (Data Streams, Firehose), MSK, QuickSight, OpenSearch Service.
AI/ML & Serverless: SageMaker, Bedrock, Rekognition, Comprehend, AppFlow, Step Functions, EventBridge.
Developer Tools: CodeCommit, CodeBuild, CodePipeline, CodeDeploy, Cloud9, X-Ray.
Team Leadership & Knowledge Sharing
Mentor junior engineers and maintain internal knowledge base.
Contribute to internal tooling (custom dashboards, alerting, automation).
Required Qualifications
8+ years of hands-on experience designing, operating, and troubleshooting production workloads on AWS.
AWS Professional-level certification (Solutions Architect Pro or DevOps Engineer Pro) + at least one Specialty (e.g., Security, Networking, Data Analytics, ML).
Broad, practical knowledge across compute, storage, database, networking, security, and management tools (see domains above).
Proficiency in Infrastructure as Code (CloudFormation, CDK, or Terraform).
Strong scripting: Python (boto3) and shell scripting; experience with automation frameworks.
Incident Response
Proven ability to resolve critical incidents in large-scale environments (include SLA metrics).
Familiar with ITIL-style problem management and blameless post-mortems.
Minimum of 5 years in a customer-facing role
3+ years designing, securing, and deploying large-scale applications and solutions
Experience with one of the major cloud vendor environments (development, access, migration)
Effectively communicate with, and present to, all levels of the organization
Subject matter expert in at least one major cloud provider (AWS/Azure/GCP)
Architect level development skills in at least one development language (Java, Python, PowerShell, etc.)
Strong skills integrating third party tools and solutions
Strong project and situational awareness
Strong attention to detail
Self-starter
Well organized
Able to travel as required
AI / GenAI Knowledge Requirements
Basic understanding of Artificial Intelligence (AI), Machine Learning (ML), and Generative AI concepts.
Exposure to AWS AI/ML services such as Amazon Bedrock, SageMaker, Rekognition, Comprehend, Textract, or Lex.
Understanding of AI-assisted operations, automation, and observability tools.
Awareness of Large Language Models (LLMs), prompt engineering, and AI governance/security best practices.
Ability to support AI/ML workloads on cloud infrastructure including GPU-enabled environments.
Familiarity with integrating AI services into DevOps and cloud automation workflows is an added advantage.
Recognized Technical Expertise
Proven skills in multiple billable technologies (as measured by certification/approved training completion/proven abilities)
Recognized subject matter expert in professional discipline
Depth of knowledge and experience enables contribution in a more complex/critical environment
Contributions often exceed the full requirements of the expected competencies relative to one’s current job title
Lead vs Contribute
Can participate with and lead a team of technical delivery resources across skillsets within a practice to help achieve customer’s business goals and technology needs
Can participate in and lead written deliverable creation across areas of expertise
Provides measurable input into AHEAD new products, processes, standards, and / or plans
Able to provide time estimates for work that needs to be accomplished based on the available information
Coordinates cross-practice and can contribute to cross-practice deliverables using current templates
Self-starter in finding work during non-scheduled work hours
Prioritizes and completes tasks independently on-time or ahead of schedule
Able to lead development and testing activities
Achieve high team standards. Contributes to the refinement of methodology and best practices to establish quality and effectiveness
Initiate and participate in authoring PCR requests when additional work is being requested
Architect Skills
Depth and demonstrated expertise across at least one technology
Able to architect and lead deployment of moderately complex solutions related to cloud solutions
Understands performance, scaling and functional characteristics of software technologies
Ability to understand open-source and cloud use-cases, and recommend standard design patterns commonly used in such solutions (best practices)
Certifications – good to have
AWS Certified Solutions Architect – Associate
AWS Certified SysOps Administrator – Associate
AWS Certified DevOps Engineer – Professional (Preferred)
AWS Certified Solutions Architect – Professional (Preferred)
HashiCorp Certified: Terraform Associate
Certified Kubernetes Administrator (CKA)
AWS Certified Machine Learning – Specialty (Good to Have)
Google Cloud Professional Machine Learning Engineer (Good to Have)
Communications (verbal and written)
Communicates with external customers, which may include senior management, on matters that require explanation, interpretation, and/or advising
Often exceeds customer expectations (communications, meeting commitments, etc.)
Able to work with outside vendors on internal or customer related issues
Participate in pre-sales activities including scoping, positioning, technology adoption and maturation
Able to identify new client opportunities inside of the POD and communicate those to upper management including the Team Lead
Can communicate beyond tactics and technology and can help customers' management set strategic direction
Can write and lead complex deliverables within the practice
Ability to understand and translate customer requirements into technical requirements
Leadership and Influence
Works to influence direct team members, broader internal team, and external customers, to agree and accept new concepts, best practices, and approaches
Advises and trains more junior staff via mentorship while helping to drive the entire AHEAD platform
A reliable resource for marketing on thought leadership pieces for their practice for AHEAD and our customers
Has started speaking at industry conferences and/or webinars
Contributes to more complex workshops and understands how their area of expertise fits into our stitching message
Accepts responsibility for own actions and sometimes those of others as part of a team
Works well with immediate, extended, and external teams; a true team player
Consistently meets expectations in terms of customer-facing, writing, presentation, and problem-solving skills and requires little to no supervision in this area
Consistently exhibits a positive attitude towards AHEAD and its customers
Willing to go the extra mile when asked to do so (i.e., time outside of normal working hours to support a customer initiative, production roll-out, could also be > 40 hours for short intervals)
Consistently pursues self-development and drives the AHEAD business model for self and sometimes others on the team
Highly respected by peers and sales team
Culture
Support initiatives beyond area of responsibility
Encourage cross team interaction
Motivate people to think creatively
Achieve success through others