Lead Platform Reliability Engineer in Canada Creek, Nova Scotia at Jobgether
Job Description
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Lead Platform Reliability Engineer based in Canada.
In this role, you will act as a senior technical authority within a global platform services organization, shaping the reliability, scalability, and operational excellence of enterprise-grade cloud infrastructure. You will lead the evolution of Azure-based platforms, including Kubernetes (AKS), PaaS services, and CI/CD ecosystems, ensuring high availability and performance across critical business systems. This position combines deep hands-on engineering expertise with strategic platform leadership, influencing architectural decisions, governance models, and engineering standards. You will collaborate with cross-functional teams, vendors, and stakeholders to deliver resilient, secure, and cost-efficient solutions. A key part of your mission will be driving continuous improvement through observability, automation, and DevOps best practices. You will also play a critical role in mentoring engineers and elevating technical maturity across the organization. This is a high-impact role where platform reliability directly supports business value at scale.
You will be responsible for ensuring the stability, scalability, and continuous evolution of enterprise cloud platforms, with a strong focus on the reliability of the Azure ecosystem and on engineering excellence. This includes ownership of platform architecture, operational performance, and lifecycle management of critical cloud services.
- Lead the design, operation, and optimization of Azure-based platforms, including AKS, PaaS services, and CI/CD pipelines.
- Define and enforce platform standards, governance models, security practices, and reliability patterns.
- Manage platform roadmaps, upgrades, and modernization initiatives in collaboration with stakeholders and vendors.
- Oversee incident management and drive end-to-end resolution of complex platform issues in a multi-vendor environment.
- Ensure 24/7 service reliability and compliance with SLA commitments across production systems.
- Implement and improve observability practices, including monitoring, logging, and performance analytics.
- Drive cost optimization initiatives across Azure subscriptions and cloud resource usage.
- Mentor engineers and contribute to building technical capabilities within the platform organization.
- Support delivery of large-scale infrastructure and transformation projects.
- Collaborate with product, engineering, and operations teams to align platform capabilities with business needs.
You bring extensive experience in cloud platform engineering, with deep expertise in Azure infrastructure, DevOps practices, and large-scale distributed systems. You are a strong technical leader capable of influencing architecture while remaining hands-on in complex environments.
- 10+ years of experience in software engineering, platform engineering, or infrastructure roles.
- 5+ years of hands-on experience with Microsoft Azure, including AKS and core PaaS services.
- Strong expertise in Kubernetes, containerization (Docker), and cloud-native architectures.
- Solid experience with CI/CD pipelines and DevOps tooling (Terraform, GitHub Actions, Jenkins, Helm, Flux).
- Deep understanding of Azure services such as Key Vault, Functions, Databricks, Synapse, Redis, and Azure Monitor.
- Experience with microservices, distributed systems, and performance/capacity management.
- Familiarity with messaging, streaming, and event-driven architectures.
- Strong knowledge of observability, incident response, and SRE principles.
- Ability to manage technical stakeholders and influence architectural decisions.
- Experience with financial or capacity planning for cloud environments is highly desirable.
- Strong communication skills and proven leadership in cross-functional environments.
- Degree in Computer Science or related field (or equivalent experience).
Benefits
- Competitive salary aligned with North American market standards
- Annual performance-based bonus and incentive programs
- Comprehensive health, dental, and vision insurance coverage
- Mental health and wellness support programs
- Retirement savings plans with employer contributions
- Paid vacation, personal days, and statutory leave entitlements
- Hybrid work model (3 days in-office, 2 days remote)
- Learning and development programs, including certifications and training support
- Inclusive and flexible work environment focused on well-being and growth
- Career progression opportunities in a global technology organization