onsite
Director, Site Reliability Engineering - Fifth Third Bank
Software Engineer
Lead a high‑performing SRE team to build, operate, and scale a secure, highly available financial platform using cloud, observability, and automation tools.
About the role
Key Responsibilities
- Lead and mentor a team of SRE engineers, driving a culture of reliability, ownership, and continuous improvement.
- Design, implement, and maintain scalable, secure, and highly available platform infrastructure across cloud environments.
- Develop and enforce incident response processes, runbooks, and post‑mortem practices to reduce MTTR and improve system resilience.
- Architect and automate observability, monitoring, and alerting solutions to provide end‑to‑end visibility into platform health.
- Collaborate with product, security, and operations teams to embed reliability best practices into the software delivery lifecycle.
Requirements
- 10+ years of experience in software engineering, with 5+ years in SRE or DevOps leadership roles.
- Deep expertise in cloud platforms (AWS, Azure, or GCP), container orchestration, and infrastructure as code.
- Proven track record of building and scaling observability, incident management, and automation pipelines.
- Strong communication skills and ability to influence cross‑functional teams.
- Experience with security best practices and compliance in a regulated financial environment is a plus.
Skills
AWS, GCP, Azure, process improvement