Remote
Software Engineering Manager - SRE - Marks & Spencer
Engineering Manager
Lead a high‑performing SRE team, driving reliability, automation, and cloud operations for critical digital services using Kubernetes, CI/CD pipelines, and advanced monitoring. Shape culture, processes, and technical strategy to deliver resilient, scalable systems.
About the role
Key Responsibilities
- Lead, mentor, and grow a cross‑functional SRE team focused on reliability, performance, and automation.
- Design and implement scalable, highly available cloud architectures using Kubernetes and CI/CD pipelines.
- Own incident response, post‑mortem processes, and continuous improvement of monitoring, alerting, and capacity planning.
- Collaborate with product, development, and security teams to embed reliability best practices into the software delivery lifecycle.
- Drive the adoption of new tools and technologies that enhance operational efficiency and reduce toil.
Requirements
- 5+ years of experience in SRE or DevOps roles, with 2+ years in a leadership capacity.
- Proficiency with Kubernetes, cloud platforms (AWS, GCP, or Azure), and CI/CD tooling (GitHub Actions, Jenkins, Argo CD).
- Strong background in monitoring, alerting, and incident management (Prometheus, Grafana, PagerDuty).
- Excellent communication skills and a proven ability to influence cross‑functional teams.
- Experience with automation, scripting (Python, Bash), and infrastructure as code (Terraform, CloudFormation).