onsite

Database SRE Manager AUS

Site Reliability Engineer

Lead a team of Database Site Reliability Engineers, ensuring high availability, performance, and scalability of Apache Cassandra and Kafka deployments across AWS and Azure cloud environments.

About the role

Key Responsibilities

Lead, mentor, and grow a team of Database SREs responsible for Cassandra and Kafka clusters.
Design, implement, and maintain highly available, fault‑tolerant database architectures on AWS and Azure.
Develop automation and infrastructure‑as‑code solutions (e.g., Terraform, Ansible) to streamline provisioning, scaling, and disaster recovery.
Monitor system health, define SLOs/SLIs, and drive incident response and post‑mortem processes.
Collaborate with development, security, and product teams to integrate reliability best practices into the software lifecycle.

Requirements

5+ years of experience operating large‑scale Cassandra and Kafka deployments in production.
Strong expertise with AWS and Azure services, including networking, storage, and compute resources.
Proficiency in Linux system administration and scripting (Bash, Python, or similar).
Hands‑on experience with infrastructure‑as‑code tools such as Terraform or CloudFormation.
Demonstrated ability to lead technical teams, drive reliability initiatives, and communicate effectively with stakeholders.

Skills

AWS, Azure, Linux, Terraform

Department: Engineering

Location: Sydney, Australia

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 20, 2026