onsite
Site Reliability / DevOps Engineer - eClerx
DevOps Engineer
Seeking a motivated SRE/DevOps Engineer to design, automate, and monitor cloud‑native platforms, driving reliability and scalability for enterprise applications using AWS, Kubernetes, Terraform, CI/CD pipelines, and observability tools.
About the role
Key Responsibilities
- Design, provision, and manage highly available AWS infrastructure using Terraform and IaC best practices.
- Build, maintain, and optimize CI/CD pipelines (e.g., Jenkins, GitHub Actions) to enable rapid, reliable software delivery.
- Implement and operate container orchestration with Kubernetes, ensuring scalability, security, and fault tolerance.
- Develop and maintain observability solutions (Prometheus, Grafana, logging) to monitor system health, detect anomalies, and drive incident response.
- Collaborate with development and product teams to embed SRE principles, define SLAs/SLOs, and improve overall platform resilience.
Requirements
- 3+ years of hands‑on experience with cloud platforms (AWS) and container orchestration (Kubernetes).
- Proficiency with infrastructure‑as‑code tools, especially Terraform.
- Strong background in CI/CD automation and scripting (Python, Bash) on Linux.
- Experience implementing observability stacks, including metrics, tracing, and logging.
- Solid understanding of SRE concepts, incident management, and performance tuning of distributed systems.
Skills
AWS, Kubernetes, Terraform, CI/CD, Linux