onsite
Senior SRE Engineer (Backend-Focused) - 17live
Site Reliability Engineer
Lead backend reliability initiatives: design and maintain scalable cloud services, automate deployments, and ensure high availability through robust monitoring and incident response.
About the role
Key Responsibilities
- Design, implement, and maintain highly available backend services on cloud platforms.
- Automate deployment pipelines and infrastructure provisioning using IaC tools.
- Develop and maintain monitoring, alerting, and incident response workflows.
- Collaborate with development teams to embed reliability best practices into code.
- Analyze performance bottlenecks and implement scalability solutions.
Requirements
- 5+ years of experience in SRE or backend engineering roles.
- Strong proficiency with cloud services (AWS, GCP, or Azure) and container orchestration (Kubernetes).
- Hands‑on experience with CI/CD pipelines, scripting (Python, Bash), and configuration management.
- Deep understanding of monitoring tools (Prometheus, Grafana, ELK) and incident management.
- Excellent problem‑solving skills and a proactive, collaborative mindset.
Skills
Kubernetes, Prometheus, Grafana