What type of employment is offered for this Staff Software Engineer - Grafana Databases, Managed Services role?

Full-time or part-time position

What is the expected salary for this Staff Software Engineer - Grafana Databases, Managed Services job?

Compensation will be discussed during the hiring process.

Staff Software Engineer - Grafana Databases, Managed Services

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Software Engineer - Grafana Databases, Managed Services in the United Kingdom.

In this role, you will operate at the intersection of large-scale distributed systems, streaming infrastructure, and cloud database platforms, helping power mission-critical observability services used globally. You will be responsible for the reliability, scalability, and performance of multi-cloud infrastructure that underpins high-throughput metrics, logs, and traces systems. Working in a deeply technical, remote-first engineering environment, you will influence architecture decisions while remaining hands-on in production systems. Your work will directly impact the stability and efficiency of large-scale data pipelines operating across hundreds of clusters. This is a high-autonomy role where you will partner with platform and database teams to solve complex distributed systems challenges. You will also play a key role in shaping operational excellence, reliability practices, and long-term system evolution across global infrastructure.

Accountabilities

In this role, you will take ownership of large-scale streaming and database infrastructure, ensuring reliability, scalability, and performance across hundreds of production clusters while driving architectural improvements and operational excellence.

Operate and evolve large-scale multi-cloud streaming and database infrastructure across production environments
Diagnose and resolve complex cross-layer failures involving storage, compute, networking, and control-plane systems
Design and implement safe rollout, upgrade, and migration strategies across distributed systems at scale
Improve observability, automation, and operational tooling to reduce system toil and increase reliability
Define and evolve SLOs, error budgets, and reliability standards for shared infrastructure systems
Partner with engineering teams to optimize query performance, data partitioning, and system scalability
Serve as a primary escalation point for high-severity incidents and lead deep root cause analysis efforts
Drive long-term architectural improvements to reduce systemic risks across multi-cluster environments
Mentor engineers and contribute to best practices in distributed systems engineering and operational excellence

Requirements

You bring deep expertise in distributed systems, infrastructure engineering, or platform engineering, with strong experience operating high-scale production systems in cloud environments. You are highly technical, autonomous, and comfortable leading complex initiatives across global teams.

8+ years of software engineering experience in SRE, platform engineering, infrastructure, or distributed systems roles
Strong experience with large-scale streaming or database systems (e.g., Kafka, Redpanda, ClickHouse, Cassandra, or similar)
Hands-on expertise with Kubernetes in AWS, GCP, or Azure environments
Proficiency in infrastructure-as-code tools such as Terraform, Helm, or similar
Strong programming skills in systems-oriented languages (Go preferred)
Deep understanding of distributed systems behavior, failure modes, and performance trade-offs
Experience with observability, incident response, and writing post-incident reviews
Strong knowledge of Linux internals, networking, storage systems, and cloud architecture
Proven ability to lead technical initiatives and influence architectural decisions without formal authority
Excellent communication skills with the ability to work effectively in remote, cross-functional teams

Benefits

Competitive compensation package including base salary, bonus (where applicable), and equity (RSUs)
Fully remote-first working model with global collaboration across distributed teams
30 days annual leave, including designated shutdown days for full disconnection
Equity ownership in the company’s long-term success through RSU participation
Access to modern AI development tools with company-supported usage budgets
Strong emphasis on autonomy, trust, and outcome-driven engineering culture
Career growth opportunities in a fast-scaling global infrastructure organization
Exposure to cutting-edge distributed systems and large-scale observability platforms
Inclusive, transparent, and highly collaborative engineering environment

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
