remote

Lead Site Reliability Engineer - JPMorganChase

Site Reliability Engineer

Lead Site Reliability Engineer driving resiliency, scalability, and reliability for enterprise‑grade services using Kubernetes, AWS, Docker, Terraform, and advanced monitoring. Own design reviews, mentor teams, and shape SRE best practices across large‑scale products.

About the role

Key Responsibilities

Lead the design and execution of resiliency reviews for medium‑ to large‑sized products, ensuring high availability and fault tolerance.
Mentor and coach engineering teams on SRE principles, incident management, and automation best practices.
Architect and maintain scalable, secure infrastructure using Kubernetes, Docker, Terraform, and AWS services.
Drive continuous improvement of monitoring, alerting, and observability pipelines to reduce MTTR and improve service health.
Collaborate with cross‑functional teams to translate business requirements into reliable, maintainable technical solutions.

Requirements

5+ years of SRE or DevOps experience in a large enterprise environment.
Deep expertise in Kubernetes, Docker, Terraform, and AWS (EC2, EKS, RDS, S3).
Proven track record of designing and operating highly available, scalable systems.
Strong scripting skills (Python, Bash) and experience with CI/CD pipelines.
Excellent communication, leadership, and problem‑solving abilities.

Skills

Kubernetes, AWS, Docker, Terraform

Company: JPMorganChase

Department: Engineering

Location: Wilmington, DE, United States

Experience: 7+ years

Tenure: Full-time

Level: Lead

Posted June 19, 2026