Senior Site Reliability Engineer in Brazil at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in Brazil.
This role sits at the core of a fast-scaling logistics technology environment where reliability, performance, and automation are critical to powering large-scale distributed systems. You will help design and operate the internal platform that enables engineering teams to deliver high-quality software with confidence and speed. The position blends infrastructure engineering, cloud operations, and software reliability practices in a highly collaborative global setup. You will take ownership of mission-critical systems while continuously improving observability, incident response, and system resilience. Working closely with multiple engineering squads, you will influence architectural decisions and drive platform-wide reliability initiatives. This is a high-impact role where your work directly strengthens system stability, efficiency, and scalability across the organization.
You will be responsible for ensuring the reliability, scalability, and performance of critical infrastructure and platform services while enabling engineering teams to operate efficiently in production environments.
- Design, deploy, and operate scalable cloud-based systems while balancing reliability, cost, and development velocity
- Own and improve SLIs/SLOs, ensuring platform services consistently meet reliability targets
- Lead incident response, root-cause analysis, and postmortem processes to prevent recurring issues
- Build and enhance observability through monitoring, logging, and alerting frameworks
- Support infrastructure-as-code and automation initiatives to improve deployment consistency and efficiency
- Collaborate with engineering teams to improve system design, performance, and operational readiness
- Contribute to CI/CD pipelines, deployment strategies, and release engineering practices
- Provide production support, including occasional off-hours incident handling when required
You bring strong hands-on experience in cloud infrastructure, DevOps, and site reliability engineering, with the ability to operate in complex distributed environments.
- 5+ years of experience in SRE, DevOps, or Cloud Engineering roles
- Strong expertise in AWS, Kubernetes, Docker, and modern cloud-native architectures
- Proficiency in Linux/UNIX systems administration and production troubleshooting
- Experience with infrastructure-as-code tools such as Terraform, Ansible, or Chef
- Strong programming/scripting skills (Python, Bash, or similar) for automation and tooling
- Solid understanding of networking, system design, and distributed systems principles
- Experience with monitoring, logging, and incident management tools and practices
- Familiarity with CI/CD pipelines and DevOps best practices
- Exposure to PostgreSQL or database operations is a plus
- Strong English communication skills and ability to work in global, distributed teams
- Problem-solving mindset with high ownership, initiative, and attention to detail
Benefits
- Competitive base salary aligned with market standards
- Equity package with ownership opportunities in a high-growth tech environment
- Unlimited PTO and flexible time-off policy
- Remote-first setup within Brazil
- Opportunity to work on large-scale distributed systems in a global engineering organization
- Collaborative, high-impact engineering culture focused on innovation and continuous improvement