Database SRE Manager Remote, AUS - CrowdStrike

Site Reliability Engineer

Lead a remote Database SRE team, driving reliability, automation, and performance for mission‑critical data services using Kubernetes, Prometheus, Grafana, AWS, Terraform, and Python. Own incident response, capacity planning, and continuous improvement of database infrastructure.

About the role

Key Responsibilities

Lead and mentor a distributed team of Database SREs, ensuring high availability and performance of critical data services.
Design, implement, and maintain Kubernetes‑based database clusters, leveraging Prometheus and Grafana for observability.
Automate infrastructure provisioning and configuration with Terraform, AWS services, and Python scripts.
Own incident response, root‑cause analysis, and post‑mortem processes to continuously improve reliability.
Collaborate with DevOps, security, and product teams to define SLAs, capacity plans, and disaster‑recovery strategies.

Requirements

5+ years of experience in database operations or site reliability engineering (SRE).
Proficiency with Kubernetes, Prometheus, Grafana, AWS, Terraform, and Python.
Strong incident management and root‑cause analysis skills.
Excellent communication and leadership abilities in a remote, distributed environment.

Skills

Kubernetes, Prometheus, Grafana, AWS, Terraform, Python

Company: CrowdStrike

Department: Engineering

Location: Australia

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 19, 2026