onsite

Manager, SRE - airasia

Site Reliability Engineer

Lead a high‑performing Site Reliability Engineering team, driving reliability, automation, and cloud‑native operations across multi‑cloud environments using Kubernetes, CI/CD pipelines, and advanced monitoring tools.

About the role

Key Responsibilities

Lead and mentor a team of SREs to design, build, and maintain highly available, scalable services across AWS and GCP.
Architect and implement CI/CD pipelines, infrastructure as code, and automated deployment workflows.
Define and enforce reliability SLAs, SLOs, and error budgets, driving continuous improvement.
Oversee incident response, root‑cause analysis, and post‑mortem processes to reduce MTTR.
Collaborate with development, security, and product teams to embed reliability best practices into the software lifecycle.

Requirements

5+ years of SRE/DevOps experience in a fast‑paced, cloud‑native environment.
Proficiency with Kubernetes, Docker, and container orchestration at scale.
Strong scripting skills (Python, Bash) and experience with IaC tools (Terraform, CloudFormation).
Hands‑on experience with monitoring/observability stacks (Prometheus, Grafana, ELK, Datadog).
Excellent communication, leadership, and problem‑solving abilities.

Skills

Kubernetes, CI/CD

Company: airasia

Department: Engineering

Location: KL Sentral - Redstation, Kerala, India

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 21, 2026