Site Reliability Engineer
Site Reliability Engineer responsible for enhancing platform stability and reliability on Azure, implementing DevSecOps practices, IaC automation, governance policies, and observability solutions using CI/CD pipelines and Kubernetes.
About Avaya
Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections.
The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships.
We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way.
Learn more at https://www.avaya.com
Description
We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms. This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.
The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.
This position includes rotational 24×7 support coverage and requires strong ownership of major incidents, RCA processes, and continuous service improvements.
Key Responsibilities
Reliability & Incident Management
Monitoring, AI-Ops, Alerts & Prevention
Posted June 25, 2026