Onsite

Systems Engineer - SRE Enablement - AutoZone

Site Reliability Engineer

Lead SRE Enablement across hybrid GCP and on‑prem environments, establishing reliability standards, building shared tooling, and coaching teams to embed operational excellence.

About the role

Key Responsibilities

Define and enforce SRE best practices, reliability standards, and incident response procedures across the organization.
Design, develop, and maintain shared automation and monitoring tools that span Google Cloud Platform and on‑prem infrastructure.
Collaborate with application, infrastructure, and architecture teams to integrate SRE principles into new and existing services.
Provide hands‑on guidance and mentorship to development teams on reliability, observability, and capacity planning.
Drive continuous improvement initiatives, including post‑mortem analysis, a blameless culture, and reliability metrics.

Requirements

5+ years of experience in SRE, DevOps, or systems engineering roles.
Deep expertise with Google Cloud Platform services (Compute Engine, Kubernetes Engine, Cloud Monitoring, Cloud Logging).
Strong scripting/automation skills (Python, Bash, Terraform, or similar).
Proven track record of building and scaling monitoring, alerting, and incident response tooling.
Excellent communication skills and a collaborative mindset.

Skills

Python, Go, Java, GCP, Kubernetes, Terraform, Ansible

Company: AutoZone

Department: Engineering

Location: Memphis, Tennessee, United States

Experience: 5+ years

Tenure: Full-time

Level: Mid-Level

Posted June 21, 2026