Reliability Engineer (SRE) - Application Performance Specialist in Brazil, Indiana at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Reliability Engineer (SRE) - Application Performance Specialist in Brazil.
This role focuses on ensuring the reliability, scalability, and high performance of critical Node.js-based applications within a modern, fast-moving engineering environment. You will work at the intersection of software engineering and site reliability, identifying performance bottlenecks, improving system resilience, and ensuring seamless production operations. The position involves close collaboration with development teams to optimize application behavior across distributed systems and data-intensive workloads. You will be responsible for implementing observability, monitoring, and incident response practices that safeguard system stability. The environment is highly technical and collaborative, with a strong emphasis on continuous improvement, automation, and operational excellence. This is a hands-on role where your work directly impacts system uptime, user experience, and platform scalability.
Responsibilities
- Design, develop, and maintain scalable and high-performance Node.js applications using frameworks such as NestJS, with PostgreSQL and MongoDB databases.
- Ensure system reliability, stability, and efficiency across application and infrastructure layers.
- Optimize application performance for scalability, responsiveness, and resource efficiency.
- Implement and manage monitoring, alerting, and observability systems to proactively detect and resolve issues.
- Conduct root cause analysis of production incidents and implement long-term preventive solutions.
- Collaborate with engineering teams to identify performance bottlenecks and improve system design.
- Develop and maintain technical documentation for system configurations, operations, and troubleshooting procedures.
Requirements
- Strong experience in Software Engineering, Site Reliability Engineering (SRE), or similar roles.
- Hands-on experience developing and optimizing Node.js applications, preferably with NestJS.
- Solid knowledge of Linux systems, command-line tools, and system troubleshooting.
- Experience with monitoring, logging, and incident response tools and practices.
- Ability to write automation and operational scripts in Python, Bash, or similar languages.
- Strong English communication skills for collaboration in international environments.
- Experience working with cloud environments such as AWS is a plus.
- Familiarity with PostgreSQL and MongoDB performance tuning and database optimization.
- Knowledge of containerization and orchestration tools such as Docker and Kubernetes is a plus.
- Strong analytical mindset with excellent problem-solving and incident management skills.
Benefits
- Competitive compensation package aligned with experience and market standards.
- Flexible hybrid or remote work arrangements.
- Dynamic and fast-growing engineering environment.
- Culture built on collaboration, integrity, and technical excellence.
- Opportunities for professional growth in cloud-native and distributed systems.
- Exposure to large-scale, high-availability production systems.
- Strong focus on automation, reliability engineering, and continuous improvement.