Head of Support & Service Reliability Engineering in United States at Sycurio

Job Function: Medical
Sycurio
United States, United States

Job Description

Head of Support & Service Reliability Engineering

We are seeking a Head of Support & Service Reliability to lead and evolve our global support function into a proactive, platform-integrated reliability capability.

This role is an exciting and dynamic opportunity for an outcome-focused individual. Sycurio is at a critical inflection point as we transition from a single-tenant architecture to a multi-tenant SaaS platform, which requires a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale.

You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering closely with Engineering, Product, and Customer-facing teams. You will also be a key contributor to our GRR goal of 90%+.

Sycurio engages a strategic managed service provider that supplies the people, tooling, and day-to-day execution across all support tiers. The Head of Support sets the standards, governs vendor performance, and ensures every aspect of the support experience, from incident response to customer satisfaction, meets enterprise-grade expectations.

Key Responsibilities:
  • Service Reliability & Platform Stability
      • Own platform availability, performance, and reliability across all tenants
      • Reduce incident frequency, severity, and blast radius
      • Establish and drive Service Reliability Engineering (SRE) principles
      • Ensure scalability and operational readiness of a multi-tenant platform

  • Incident Management & Response
      • Implement and lead a structured incident management framework (P1–P4)
      • Act as executive owner of major incidents (P1/P2)
      • Drive improvements in:
          • Mean Time to Detect (MTTD)
          • Mean Time to Resolve (MTTR)
      • Ensure clear, consistent internal and external communication during incidents

  • Observability & Monitoring
      • Define and implement a comprehensive observability strategy, including:
          • Technical telemetry (infrastructure, application, APIs)
          • Business telemetry (transactions, payment success rates, usage)
          • End-to-end customer journey visibility
      • Ensure issues are detected proactively rather than reported by customers
      • Partner with Product and Engineering to embed telemetry into the platform

  • Support Operations (L1–L3)
      • Lead global support teams ensuring high-quality, SLA-driven case management
      • Define and enforce support processes, tooling, and performance standards
      • Improve key metrics:
          • First response time
          • Resolution time
          • Reopen rate
          • Escalation quality

  • Platform Operations & Change Management
      • Oversee operational aspects of the platform, including:
          • Release management and deployment safety, ensuring all releases are observable, reversible, and low-risk
          • Change control processes
          • Environment consistency across staging and production
      • Own the visibility and continuous improvement of delivery and recovery performance using the DORA metrics, in partnership with Engineering

  • Issue Management & Root Cause Discipline
      • Establish rigorous Root Cause Analysis (RCA) standards
      • Identify and eliminate systemic issues (not just symptom fixes)
      • Track and reduce recurring incidents
      • Feed insights into Product and Engineering roadmaps

  • Customer Experience & Commercial Alignment
      • Align support with Customer Success and Sales
      • Ensure coordinated communication during incidents
      • Protect customer relationships during critical events
      • Introduce tenant-aware impact assessment (ARR, strategic accounts, regulatory exposure)
      • Support enterprise-grade expectations for transparency and reliability

  • Cross-functional Leadership
      • Act as the bridge between:
          • Engineering
          • Product
          • Customer Delivery / Success
      • Embed supportability and operational readiness into:
          • Pre-sales (Stage 4/5 governance)
          • Product development
          • Deployment processes

  • Managed Service Governance
      • Chair regular operational reviews and quarterly business reviews with the managed service leadership team
      • Own the managed service scorecard: defining KPIs, reviewing performance data, and driving accountability for misses
      • Manage contract compliance, SLA adherence, and commercial exposure from managed service underperformance
      • Lead continuous improvement programs jointly with the managed service provider, including tooling upgrades, process redesigns, and training investments
      • Maintain an escalation path for systemic or persistent managed service failure, up to and including remediation planning

Key Qualifications, Skills, and Experience:

Required

  • 10+ years in Support, Platform Operations, or SRE leadership roles

  • Proven experience in multi-tenant SaaS and legacy environments

  • Strong understanding of:
      • Distributed systems
      • Incident management at scale
      • Observability frameworks

  • Track record of building and scaling high-performing operational teams

  • Experience in outsourced or hybrid operational models

  • Experience working cross-functionally with Engineering and Product

Preferred

  • Background in payments, security, or compliance-driven environments (e.g., PCI)

  • Experience with API-first platforms and telephony/payment flows

  • Familiarity with observability tools (e.g., Grafana)

Job Location

United States, United States
