Senior AIOps Engineer in India at Jobgether

Job Function: Information Technology
Jobgether
India

Job Description

Senior AIOps Engineer

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior AIOps Engineer I in India.

This role sits at the intersection of AI, machine learning, and platform reliability, focusing on ensuring that production AI systems operate efficiently, securely, and at scale. You will be responsible for maintaining and improving the operational health of AI/ML-powered services running in production environments. The position involves working closely with data scientists, ML engineers, and platform teams to ensure smooth deployment, monitoring, and lifecycle management of AI models. You will play a key role in building observability, automation, and infrastructure that supports reliable AI delivery. The environment is highly collaborative and fast-evolving, with a strong emphasis on scalability, cost optimization, and production readiness. This is a hands-on engineering role where your work directly impacts the stability and performance of AI-driven products used at scale.

Accountabilities:
  • Own the reliability, availability, and performance of AI/ML services in production environments.
  • Define and maintain SLOs/SLIs for AI systems, ensuring alignment with user experience and business outcomes.
  • Monitor, detect, and mitigate model drift, performance degradation, and system issues in production.
  • Design and implement observability solutions including monitoring, logging, alerting, and dashboards for AI systems.
  • Support deployment workflows for ML models, including canary, blue/green, and A/B testing strategies.
  • Operate and improve AI infrastructure components such as model serving systems, LLM gateways, and RAG pipelines.
  • Manage CI/CD pipelines and automation to improve deployment reliability and reduce operational overhead.
  • Participate in incident management, on-call rotations, and post-incident reviews to improve system resilience.
  • Collaborate with cross-functional teams to ensure scalable, secure, and cost-efficient AI operations.

Requirements:
  • 4+ years of software engineering experience, including at least 3 years in production systems, SRE, DevOps, or platform engineering roles.
  • Strong experience operating distributed systems on Kubernetes and cloud platforms.
  • Hands-on experience with Google Cloud Platform services such as GKE, BigQuery, Pub/Sub, Vertex AI, Cloud SQL, and GCS.
  • Solid understanding of CI/CD pipelines, infrastructure-as-code (Terraform preferred), and deployment automation.
  • Experience with monitoring, logging, and observability tools such as Datadog, Prometheus, Grafana, or the ELK stack.
  • Familiarity with containerization and Docker image lifecycle management.
  • Understanding of ML lifecycle concepts including training, deployment, evaluation, and monitoring.
  • Exposure to AI/ML tooling such as LLM gateways, vector databases, RAG systems, or embedding pipelines is a strong plus.
  • Strong Python programming skills and solid software engineering fundamentals.
  • Excellent communication skills with the ability to work across technical and non-technical stakeholders.
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

Benefits:
  • Competitive annual salary aligned with experience and market standards.
  • Fully remote work with structured overlap hours for global collaboration.
  • Comprehensive health, accident, and retirement benefits.
  • Paid holidays, generous leave policies, and wellness programs.
  • Exposure to cutting-edge AI/ML infrastructure and large-scale production systems.
  • Strong culture of learning, ownership, and cross-functional collaboration.
  • Opportunity to work on high-impact AI systems used in real-world production environments.
  • Inclusive and globally distributed team environment.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
