onsite
Senior Site Reliability Engineer, Private Cloud - Lloyds Banking Group
Site Reliability Engineer
As a Senior Site Reliability Engineer, you will drive the design, deployment, and operation of private cloud infrastructure using Kubernetes, Docker, Terraform, and AWS, and ensure high availability, performance, and security through robust monitoring and CI/CD pipelines.
About the role
Key Responsibilities
- Design, implement, and maintain scalable Kubernetes clusters and Docker container environments for private cloud services.
- Automate infrastructure provisioning and configuration using Terraform and AWS CloudFormation.
- Develop and maintain CI/CD pipelines to streamline application deployments and updates.
- Implement comprehensive monitoring, alerting, and logging solutions to ensure system reliability and performance.
- Collaborate with development teams to optimize application architecture for resilience and scalability.
- Conduct incident response, root cause analysis, and post‑mortem reviews to continuously improve system stability.
Requirements
- 5+ years of experience in site reliability engineering or DevOps roles.
- Proficiency with Kubernetes, Docker, Terraform, and AWS services (EC2, EKS, S3, CloudWatch).
- Strong scripting skills in Python or Bash for automation and tooling.
- Experience with monitoring tools such as Prometheus, Grafana, or Datadog.
- Excellent problem‑solving skills and a proactive approach to system optimization.
Skills
Kubernetes, Docker, Terraform, AWS, Python, CI/CD