Senior Site Reliability Engineer (B2B Contract) in UK at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer (B2B Contract) in the United Kingdom.
In this role, you will contribute to the reliability, scalability, and performance of mission-critical platforms supporting global healthcare operations. You’ll work within an international engineering environment focused on cloud modernization, operational excellence, and automation at scale. The position combines hands-on technical ownership with strategic collaboration across infrastructure, platform, security, and product teams. You will play a central role in improving system resilience, streamlining deployments, and enhancing observability practices across distributed environments. This opportunity is ideal for engineers who enjoy solving complex reliability challenges while working with modern cloud-native technologies. You’ll also have the chance to mentor peers, influence best practices, and help shape long-term platform reliability strategies in a highly collaborative setting.
Responsibilities
- Lead site reliability and operational excellence initiatives across production systems and cloud-based services.
- Define, implement, and manage reliability metrics including SLIs, SLOs, SLAs, and error budgets to ensure platform stability and performance.
- Design and maintain scalable, resilient cloud-native architectures with a strong focus on automation and infrastructure reliability.
- Build and optimize Infrastructure as Code and CI/CD pipelines to improve deployment efficiency and consistency.
- Develop and maintain monitoring, logging, tracing, and alerting capabilities to enhance system observability and proactive incident response.
- Drive incident management processes, including troubleshooting, root cause analysis, post-incident reviews, and preventive improvements.
- Collaborate with cross-functional global teams across engineering, security, product, and vendor management functions.
- Support operational maturity initiatives through documentation, runbooks, automation, and continuous process optimization.
- Mentor engineers and contribute to technical knowledge sharing and reliability best practices across teams.
- Perform capacity planning, system performance analysis, and reliability assessments to ensure long-term scalability.
Requirements
- 5+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or similar infrastructure-focused roles.
- Strong hands-on expertise with AWS cloud services; experience with Azure or GCP is considered an advantage.
- Proven experience using Infrastructure as Code tools such as Terraform or CloudFormation.
- Solid understanding of CI/CD pipelines, automation practices, and Git-based development workflows.
- Experience implementing and managing reliability frameworks including SLIs, SLOs, SLAs, and error budgets.
- Practical knowledge of observability and monitoring tools such as Prometheus, Grafana, ELK/EFK, OpenTelemetry, and distributed tracing solutions.
- Scripting or programming skills in Python, Go, Bash, or PowerShell.
- Strong understanding of networking concepts including VPCs, VPNs, load balancers, and firewalls.
- Familiarity with cloud security principles, compliance frameworks, and operational best practices.
- Excellent troubleshooting, communication, and stakeholder management skills within global and cross-functional environments.
- Experience in regulated industries such as healthcare or pharmaceuticals is a plus.
- Additional exposure to platform engineering, cloud-native analytics, AI/Copilot technologies, or Power Platform solutions is beneficial.
- Relevant certifications, such as AWS Professional-level or Kubernetes certifications, are considered an asset.
- Previous mentoring or leadership experience is highly valued.
Contract Details and Benefits
- Fully remote contract opportunity based in Europe.
- Freelance / B2B contract arrangement.
- Full-time allocation with high-impact international projects.
- Opportunity to work on large-scale cloud modernization and reliability initiatives.
- Exposure to global engineering teams and advanced cloud-native technologies.
- Flexible and collaborative remote work environment.
- Planned project start between May and June.