onsite

Senior Site Reliability Engineer - NAB - National Australia Bank

Site Reliability Engineer

Lead the design, deployment, and operation of highly available, scalable cloud services using Kubernetes, Docker, and AWS, while implementing robust monitoring, alerting, and automation to ensure optimal performance and reliability.

About the role

Key Responsibilities

Architect, deploy, and maintain production-grade Kubernetes clusters and containerized workloads across AWS environments.
Design and implement CI/CD pipelines, infrastructure as code (Terraform), and automated rollouts to accelerate feature delivery.
Develop and maintain comprehensive monitoring, alerting, and observability solutions using Prometheus, Grafana, and related tooling.
Collaborate with development, security, and product teams to define reliability SLAs, SLOs, and incident response procedures.
Lead post‑mortem analyses, root‑cause investigations, and continuous improvement initiatives to reduce MTTR and prevent recurrence.

Requirements

5+ years of experience in site reliability engineering or DevOps roles, with a strong focus on cloud-native technologies.
Proficiency in Kubernetes, Docker, and AWS services (EKS, EC2, S3, CloudWatch).
Hands‑on experience with Terraform, CI/CD tools (GitHub Actions, Jenkins, Argo CD), and scripting (Bash, Python).
Deep understanding of monitoring, logging, and alerting best practices using Prometheus, Grafana, ELK, or similar stacks.
Excellent problem‑solving skills, strong communication, and a proactive, customer‑centric mindset.

Skills

Kubernetes, Docker, CI/CD, AWS, Prometheus, Grafana, Terraform

Company: NAB - National Australia Bank

Department: Engineering

Location: Melbourne City Centre, Australia

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 24, 2026