Remote
Senior Site Reliability Engineer, Enterprise Cloud Platforms - Bank of America
Site Reliability Engineer
Senior Site Reliability Engineer responsible for designing, operating, and scaling enterprise cloud platforms on AWS, leveraging Kubernetes, Terraform, Python automation, CI/CD pipelines, and advanced monitoring to ensure high availability and performance.
About the role
Key Responsibilities
- Design, implement, and maintain highly available, scalable cloud infrastructure on AWS for enterprise applications.
- Develop and manage Kubernetes clusters, including networking, security, and observability.
- Automate provisioning and configuration using Terraform and Python scripts to support continuous delivery.
- Build and maintain CI/CD pipelines that enable rapid, reliable code deployments.
- Implement robust monitoring, alerting, and incident response processes to meet SLOs and improve system reliability.
Requirements
- 5+ years of experience in site reliability or DevOps engineering, with a focus on cloud platforms.
- Strong expertise in AWS services, Kubernetes orchestration, and infrastructure-as-code (Terraform).
- Proficiency in Python for automation and scripting tasks.
- Hands‑on experience building CI/CD pipelines and implementing monitoring/observability tools.
- Solid understanding of networking, security, and performance tuning in cloud environments.
Skills
Kubernetes, AWS, Terraform, Python, CI/CD