DevOps/Observability Engineer in United States at Jobgether

Job Function: Engineering
Jobgether
United States

Job Description

DevOps/Observability Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a DevOps/Observability Engineer based in the United States.

This role sits at the core of modern cloud infrastructure reliability, focused on building and scaling a next-generation observability platform for complex, distributed systems. You will design and implement end-to-end monitoring, logging, and telemetry pipelines that provide deep visibility across large-scale cloud environments. The position requires strong expertise in cloud-native architectures, with a focus on AWS, Kubernetes, and open-source observability tooling. You will play a key role in unifying metrics, logs, and traces using technologies such as OpenTelemetry, Prometheus, Grafana, and Splunk. Operating in a fast-paced, engineering-driven environment, you will collaborate closely with platform and DevOps teams to improve system reliability, performance, and cost efficiency. This is a highly technical, hands-on role where your work directly strengthens the stability and scalability of mission-critical systems.

Accountabilities:
  • Design and implement end-to-end observability architectures using OpenTelemetry, Prometheus, Grafana, and related tools across cloud environments.
  • Build and maintain centralized observability pipelines across multi-account AWS environments, including CloudWatch, CloudTrail, and VPC Flow Logs.
  • Develop scalable log aggregation and routing strategies, including filtering, noise reduction, and integration with systems such as Splunk HEC.
  • Create advanced alerting frameworks and high-quality dashboards using Alertmanager, CloudWatch Alarms, and Grafana with PromQL.
  • Deploy and manage observability infrastructure using Infrastructure as Code tools such as Terraform.
  • Support Kubernetes and container-based observability across EKS and ECS environments.
  • Optimize observability systems for performance, cost efficiency, and scalability in large-scale production environments.
  • Collaborate with engineering teams to improve system reliability, monitoring standards, and incident response capabilities.

Requirements:
  • 8+ years of experience in DevOps, Site Reliability Engineering, or Observability Engineering roles.
  • Strong hands-on experience designing unified observability pipelines using OpenTelemetry, Prometheus, and Grafana.
  • Deep expertise in AWS observability services including CloudWatch, CloudTrail, and cross-account telemetry strategies.
  • Proven ability to build and manage large-scale log aggregation systems and optimize high-volume data pipelines.
  • Strong experience with Kubernetes (EKS) or containerized environments (ECS) in production settings.
  • Advanced proficiency with Terraform or other Infrastructure as Code tools for infrastructure and observability deployments.
  • Experience building alerting systems, dashboards, and monitoring frameworks for distributed systems.
  • Strong understanding of cost optimization strategies for observability platforms (log filtering, metric reduction, storage tiering).
  • Excellent problem-solving, debugging, and collaboration skills in complex cloud-native environments.

Benefits:
  • Competitive compensation aligned with experience and market benchmarks.
  • Remote work flexibility within the United States.
  • Opportunity to work on large-scale, AI-driven, cloud-native infrastructure systems.
  • Exposure to enterprise clients and high-impact digital transformation projects.
  • Hands-on experience with leading observability and cloud technologies in production environments.
  • Strong learning and upskilling culture in AI, cloud, and platform engineering.
  • Collaborative, high-performance engineering environment focused on innovation and reliability.
  • Opportunity to shape next-generation observability practices at scale.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
