Remote
Sr Administrator Tools & Automation - HCLTech
Software Engineer
Senior Administrator leading a 24x7 Global Command Center, overseeing proactive monitoring, incident response, and service continuity using monitoring platforms, job scheduling tools, and ITIL‑based processes.
About the role
Key Responsibilities
- Lead and mentor a 24x7 command center team to ensure continuous monitoring of IT infrastructure, applications, and network services.
- Drive incident detection, escalation, and resolution using event‑monitoring platforms and ITIL best practices.
- Manage job/batch scheduling workflows, ensuring timely execution and recovery of critical workloads.
- Maintain and optimize monitoring tool configurations, dashboards, and alerts to meet SLA targets.
- Coordinate with cross‑functional teams to implement service continuity plans and conduct post‑incident reviews.
Requirements
- 5+ years of experience in command‑center or NOC operations with a focus on monitoring and event management.
- Strong knowledge of monitoring solutions (e.g., Splunk, Nagios, Zabbix) and job scheduling tools (e.g., Control‑M, AutoSys).
- Hands‑on scripting ability in Python or Bash for automation and alert customization.
- ITIL certification or proven experience applying ITIL processes for incident, problem, and change management.
- Excellent leadership, communication, and stakeholder‑management skills in a global, 24x7 environment.