Data Site Reliability Engineer (SRE) – Databricks & Azure at Jobgether – Brazil, Indiana
About This Position
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Data Site Reliability Engineer (SRE) – Databricks & Azure in Brazil.
This role is focused on ensuring the reliability, scalability, and operational excellence of a modern cloud-based data platform built on Databricks and Microsoft Azure. You will be responsible for maintaining and optimizing the data pipelines, infrastructure, and deployment workflows that power critical business data use cases. Working in a fully remote environment, you will collaborate closely with data engineers, platform engineers, and DevOps teams to improve system performance, stability, and cost efficiency. The position requires a strong focus on automation, incident response, and continuous improvement. You will play a key role in safeguarding production environments and ensuring seamless release cycles across data workloads. This is an opportunity to contribute to a highly scalable data ecosystem within a fast-paced, cloud-first engineering culture.
Accountabilities:
- Ensure the reliability, performance, and scalability of data pipelines and Databricks-based workloads across Azure cloud environments
- Design, implement, and maintain CI/CD pipelines using Azure DevOps, including triggers, environments, and deployment workflows
- Manage Azure DevOps projects, including user access, permissions, repositories, and branching strategies
- Provision and maintain cloud infrastructure using Terraform and Infrastructure as Code best practices
- Support Databricks environments (Unity Catalog, Delta Lake, Lakehouse Federation) and optimize performance and stability
- Lead incident management processes, including troubleshooting, communication, post-mortems, and runbook creation
- Coordinate and support end-to-end release management activities with multiple stakeholders
- Collaborate with engineering teams to improve system observability, automation, and operational efficiency
Requirements:
- Strong experience with SQL for querying, troubleshooting, and performance optimization
- Proficiency in Bash and Python scripting for automation and operational tasks
- Hands-on experience with Azure cloud services, including compute, storage, networking, and containerization
- Proven experience building CI/CD pipelines using Azure DevOps
- Solid knowledge of Git workflows, repository management, and branch policies
- Strong experience with Terraform and Infrastructure as Code principles
- Experience working with Databricks and related Lakehouse technologies
- Background in incident management, production support, and release coordination
- Familiarity with observability tools such as Prometheus or Grafana is a plus
- Experience in data engineering, distributed systems, or big data ecosystems is highly desirable
- Strong problem-solving, communication, and collaboration skills
Benefits:
- Fully remote work model
- Full-time permanent position
- Opportunity to work with modern cloud and data technologies (Azure, Databricks, Terraform)
- Exposure to large-scale data platforms and enterprise-grade engineering environments
- Career growth in cloud engineering, DevOps, and data reliability domains
- Collaborative and international engineering culture
- Inclusion in a global organization with a strong focus on continuous learning and innovation