Remote
Lead Site Reliability Engineer, Market Risk - JPMorganChase
Site Reliability Engineer
Lead Site Reliability Engineer driving reliability, scalability, and resilience for Market Risk products using Kubernetes, AWS, CI/CD pipelines, and advanced monitoring. You will own design reviews, mentor teams, and shape SRE culture across medium-to-large-scale services.
About the role
Key Responsibilities
- Lead SRE initiatives, championing reliability culture and best practices across the Market Risk technology stack.
- Conduct resiliency design reviews, decompose complex problems, and guide engineering teams through technical solutions.
- Own incident response, post‑mortem analysis, and continuous improvement of monitoring, alerting, and automation.
- Architect and maintain scalable, highly available services on Kubernetes and AWS, ensuring performance and cost efficiency.
- Mentor and coach engineers, fostering a collaborative environment and driving professional growth.
Requirements
- 5+ years of SRE or DevOps experience in a large, regulated environment.
- Deep expertise with Kubernetes, AWS services (EKS, EC2, S3, CloudWatch), and CI/CD tooling (GitHub Actions, Jenkins, Argo CD).
- Strong background in monitoring, observability, and incident management (Prometheus, Grafana, PagerDuty).
- Excellent communication skills, with the ability to influence cross-functional teams and translate business needs into technical solutions.
- Experience with security, compliance, and audit requirements in financial services is a plus.