Senior Site Reliability Engineer in Canada Creek, Nova Scotia at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in Canada.
This role sits at the heart of a fast-moving engineering organization, focused on building and maintaining the infrastructure that powers reliable, scalable SaaS products. You will work across multiple development teams to ensure services are resilient, observable, and efficient in production environments. The position blends software engineering with infrastructure expertise, emphasizing automation, performance, and operational excellence. You will help shape platform reliability strategies while improving developer experience through tooling and self-service systems. The environment is highly collaborative, remote-first, and values clear written communication alongside meaningful synchronous collaboration. Engineers in this role are encouraged to iterate, learn continuously, and balance long-term improvements with immediate operational needs. It is a high-impact opportunity to directly influence system reliability and customer experience at scale.
Responsibilities
- Partner with software engineering teams to design, build, and maintain reliable and resilient services in production environments.
- Develop and improve infrastructure automation to enhance usability and reduce operational friction for internal teams.
- Build and maintain observability solutions, including monitoring, logging, and alerting systems to diagnose system performance and stability issues.
- Identify and eliminate operational toil by automating repetitive manual processes.
- Contribute to infrastructure design through documentation, technical designs, and runbooks to support engineering best practices.
- Ensure systems and processes meet security and compliance requirements while maintaining high operational standards.
- Participate in on-call rotations to support incident response and ensure system uptime and reliability.
Requirements
- 5+ years of experience in Software Engineering, Site Reliability Engineering, or DevOps roles.
- Strong communication skills, both written and verbal, with experience working in collaborative tooling environments (e.g., Slack, Notion, GitHub).
- Hands-on experience with Infrastructure as Code, particularly Terraform and AWS.
- Solid understanding of containerization and orchestration tools (e.g., ECS or equivalent).
- Strong programming skills in languages such as Bash, Python, and/or Golang.
- Experience with observability tools and practices (e.g., Datadog or similar platforms).
- Strong problem-solving mindset with the ability to balance pragmatic execution and long-term engineering quality.
- Nice to have: experience with FinOps, distributed systems, multi-region architectures, SaaS operations, compliance environments (SOC2, HIPAA), or vendor management.
Benefits
- Competitive compensation package (salary range: $195,700 – $225,000 USD base, depending on experience and location factors).
- Comprehensive medical, dental, and vision coverage (full coverage for employees and partial coverage for dependents in applicable regions).
- Paid vacation time and quarterly mental health days to support work-life balance.
- Retirement savings support (e.g., 401k plan where applicable).
- Provision of company-approved hardware and tools to support productivity.
- Employee Resource Groups (ERGs) supporting diversity, inclusion, and community engagement.
- Fully remote work environment with a strong emphasis on flexibility and asynchronous collaboration.