Remote
DevOps and Site Reliability Engineer - Booz Allen Hamilton
Site Reliability Engineer
Senior DevOps engineer and SRE building secure, scalable container platforms on AWS, using Kubernetes, Terraform, and CI/CD pipelines to deliver mission‑critical services with high reliability and observability.
About the role
Key Responsibilities
- Design, implement, and maintain containerized application platforms using Kubernetes on AWS.
- Build and manage CI/CD pipelines that follow GitOps principles to accelerate delivery cycles.
- Automate infrastructure provisioning and configuration with Terraform and other IaC tools.
- Implement robust monitoring, logging, and alerting using Prometheus, Grafana, and the ELK stack.
- Ensure security compliance through vulnerability scanning, secrets management, and IAM best practices.
- Collaborate with development, security, and operations teams to troubleshoot incidents and drive continuous improvement.
Requirements
- 5+ years of experience in DevOps or SRE roles, with a strong focus on cloud-native technologies.
- Hands‑on expertise with Kubernetes, Docker, and AWS services (EKS, ECS, S3, IAM).
- Proficiency in IaC tools such as Terraform or CloudFormation.
- Solid scripting skills in Bash, Python, or Go for automation.
- Experience with CI/CD tools (Jenkins, GitLab CI, ArgoCD) and observability platforms.
Skills
Kubernetes, AWS, Terraform, CI/CD