Remote
Director, Data Center Reliability Engineering - Oracle
Software Engineer
Lead data‑center reliability initiatives: standardize failure mode and effects analysis (FMEA), root cause analysis (RCA), and continuous improvement across sites, and deploy monitoring, analytics, and automation tools to improve uptime and support KPI reporting.
About the role
Key Responsibilities
- Lead reliability engineering and analytics teams across multiple data‑center sites.
- Standardize and enforce FMEA, RCA, and continuous improvement methodologies.
- Oversee deployment of monitoring, analytics, and automation tools supporting reliability programs.
- Define, track, and report reliability KPIs to executive and global operations leadership.
- Ensure corrective actions are implemented, verified, and sustained.
- Develop engineers and analysts in disciplined, data‑driven problem solving.
Requirements
- Senior experience in reliability engineering, maintenance engineering, or uptime‑critical environments.
- Strong background in analytics, RCA rigor, and data‑driven decision making.
- Proven ability to lead cross‑functional teams and influence executive stakeholders.
- Experience with monitoring, automation, and KPI reporting tools.
- Excellent communication and mentoring skills.
Skills
- Software development
- System design
- Problem solving