Remote

Senior Manager, Site Reliability Engineering - Brinker International

Software Engineer

Lead a high‑performing Site Reliability Engineering team, driving reliability, automation, and scalability for critical restaurant support platforms using Kubernetes, AWS, Terraform, and modern CI/CD practices.

About the role

Key Responsibilities

Lead, mentor, and grow a team of site reliability engineers to deliver highly available, performant services for restaurant operations.
Design and implement cloud‑native architectures on AWS, leveraging Kubernetes, Terraform, and serverless components.
Develop and maintain CI/CD pipelines, automated testing, and release processes to accelerate safe deployments.
Establish observability standards using Prometheus, Grafana, and logging solutions; drive proactive monitoring and alerting.
Own incident response, root‑cause analysis, and post‑mortem processes to continuously improve system reliability.
Collaborate with product, security, and infrastructure teams to embed reliability and scalability into the development lifecycle.

Requirements

5+ years of hands‑on SRE or DevOps experience, with at least 2 years in a people‑management role.
Deep expertise in Kubernetes orchestration, AWS services, and infrastructure‑as‑code (Terraform or CloudFormation).
Proficiency in scripting or programming languages such as Python for automation and tooling.
Strong background in CI/CD tooling (Jenkins, GitLab CI, GitHub Actions) and automated testing frameworks.
Demonstrated ability to lead incident management, perform root‑cause analysis, and drive continuous improvement.

Skills

Kubernetes, AWS, Terraform, Python, CI/CD

Company: Brinker International

Department: Engineering

Location: Coppell, Texas, United States

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 24, 2026