onsite
Support Lead - Kubernetes - HCLTech
Software Engineer
Lead L1/L2 support for production Kubernetes environments: manage incidents, perform root‑cause analysis, and tune performance using Linux, SQL, REST APIs, AIOps, and observability tools while maintaining SLA compliance.
About the role
Key Responsibilities
- Provide L1/L2 production support for Kubernetes‑based applications, managing the full ticket lifecycle from detection to resolution.
- Perform Linux system troubleshooting, log analysis, and SQL query debugging to identify and fix application and integration issues.
- Use AIOps and observability platforms to monitor events, detect anomalies, and prevent incidents proactively.
- Conduct root‑cause analysis, create detailed incident reports, and implement corrective actions to meet SLA targets.
- Collaborate with development and infrastructure teams to design and improve monitoring, alerting, and performance‑optimization strategies.
Requirements
- 3+ years of hands‑on experience in production support for Kubernetes and containerized workloads.
- Strong Linux administration skills and proficiency in SQL query writing and debugging.
- Experience with REST APIs, messaging systems, and modern observability tools (e.g., Prometheus, Grafana, ELK).
- Familiarity with AIOps platforms and cloud environments (AWS, Azure, or GCP).
- Excellent problem‑solving and communication skills and strong customer focus, with a track record of meeting SLA commitments.