remote

Site Reliability Engineer (SRE) - Tech Next

Site Reliability Engineer

Tech Next is seeking a Senior Site Reliability Engineer with 8+ years of experience driving reliability, scalability, and automation across mission‑critical cloud platforms, using Kubernetes, AWS, and observability tools such as Prometheus, Grafana, and ELK to keep services performant and reliable.

About the role

Key Responsibilities

Design, implement, and maintain highly available, scalable infrastructure on AWS, ensuring 99.99% uptime for mission‑critical services.
Develop and automate deployment pipelines, configuration management, and monitoring solutions using Kubernetes, Terraform, and CI/CD tools.
Lead incident response, root‑cause analysis, and post‑mortem processes to continuously improve system reliability.
Collaborate with engineering teams to embed observability, performance testing, and capacity planning into the development lifecycle.
Drive platform engineering initiatives, standardizing best practices, tooling, and documentation across the organization.

Requirements

8+ years of SRE or DevOps experience in large‑scale production environments.
Proficiency with AWS services, Kubernetes, Terraform, and CI/CD pipelines.
Strong scripting skills in Python or Bash for automation and tooling.
Deep understanding of monitoring, logging, and alerting platforms (Prometheus, Grafana, ELK).
Excellent problem‑solving, communication, and collaboration skills.

Skills

Kubernetes, AWS

Company: Tech Next

Department: Engineering

Location: India

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Salary: 90,000

Posted June 19, 2026