Senior Site Reliability Engineer in Canada Creek, Nova Scotia at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in Canada.
In this role, you will help design, build, and maintain the foundational infrastructure that powers a high-scale SaaS platform used by leading global companies. You will work at the intersection of engineering, operations, and platform reliability, ensuring that systems remain scalable, secure, and highly available. Acting as a key partner to product and development teams, you will enable faster delivery cycles while improving system observability and resilience. This is a hands-on engineering role where automation, cloud infrastructure, and performance optimization are central to your daily work. You will contribute to reducing operational overhead by eliminating manual processes and building self-service tooling for internal teams. The environment is fully remote, highly collaborative, and focused on thoughtful engineering practices, strong documentation, and continuous improvement.
Responsibilities
- Partner with engineering teams to design and implement highly reliable, scalable, and resilient services
- Build and maintain infrastructure automation tools to improve usability and operational efficiency across teams
- Develop and enhance observability systems, dashboards, and monitoring tools to detect and resolve performance issues
- Eliminate operational toil by automating repetitive processes and improving system workflows
- Design, maintain, and improve Infrastructure as Code using tools such as Terraform on AWS
- Manage containerized environments and orchestration systems (e.g., ECS) to support production workloads
- Ensure compliance, security, and reliability standards are consistently met across systems
- Contribute to technical documentation, runbooks, and design reviews to improve engineering knowledge sharing
Requirements
- 5+ years of experience in Software Engineering, Site Reliability Engineering, or DevOps roles
- Strong experience with Infrastructure as Code, particularly Terraform and AWS environments
- Hands-on experience with containers and orchestration systems such as ECS
- Proficiency in programming or scripting languages such as Python, Bash, or Golang
- Experience working with observability tools such as Datadog or similar platforms
- Strong communication skills, both written and verbal, with experience using tools like Slack, Notion, and GitHub
- Ability to work effectively in distributed systems environments and collaborate across engineering teams
- Strong problem-solving mindset with a focus on reliability, scalability, and automation
Benefits
- Comprehensive health coverage including medical, dental, and vision (US-specific structure where applicable)
- Paid vacation days and quarterly mental health days to support work-life balance
- Retirement savings plan (e.g., 401k where applicable)
- Company-provided hardware and tools to support productivity
- Inclusive employee resource groups supporting diversity, equity, and belonging
- Remote-first work environment with strong emphasis on flexibility and autonomy
- Opportunity to work on large-scale SaaS infrastructure serving global customers
- Strong engineering culture focused on collaboration, ownership, and continuous improvement