onsite
Senior Systems Operations Engineer - Technology Major Problem Manager - Wells Fargo
Systems Engineer
Lead end‑to‑end Technology Major Problem Management for high‑severity enterprise events, driving root‑cause analysis, systemic risk mitigation, and long‑term reliability improvements using Incident, Change, and Risk Management practices.
About the role
Key Responsibilities
- Own the end‑to‑end lifecycle of Technology Major Problem Management for high‑severity incidents, ensuring thorough root‑cause analysis and actionable remediation plans.
- Collaborate with Technology Major Incident Management, Platform Engineering, Change Management, Risk, and Technology Control teams to translate incident outcomes into long‑term reliability, restorability, and resiliency improvements.
- Lead cross‑functional problem‑management meetings, facilitate post‑mortem reviews, and drive continuous improvement initiatives across the technology stack.
- Develop and maintain problem‑management metrics, dashboards, and reports to track trends and risk exposure.
- Advise on risk mitigation strategies, change controls, and preventive measures to reduce recurrence of critical incidents.
Requirements
- Extensive experience in Problem Management and Incident Management within a large enterprise environment.
- Strong analytical skills, with a proven ability to conduct root‑cause analysis and implement systemic risk mitigations.
- Deep understanding of Change Management, Risk Management, and Reliability Engineering principles.
- Excellent communication and stakeholder‑management skills, with the ability to influence cross‑functional teams.
- Experience with platform engineering and large‑scale system operations is highly desirable.