Remote / Onsite
Manager, Cloud Support Operations - OpenText
Software Engineer
Lead cloud support operations, driving incident resolution, automation, and reliability across AWS, Azure, and GCP environments. Leverage Kubernetes, DevOps practices, and SRE principles to ensure high availability and continuous improvement of cloud services.
About the role
Key Responsibilities
- Oversee day‑to‑day cloud support operations, ensuring rapid incident detection, triage, and resolution across AWS, Azure, and GCP platforms.
- Implement and maintain automation pipelines (CI/CD, IaC) to streamline deployments and reduce manual effort.
- Collaborate with engineering, security, and product teams to define and enforce best practices for cloud architecture, monitoring, and capacity planning.
- Lead root‑cause analysis and post‑mortem processes, translating findings into actionable improvements and knowledge base updates.
- Mentor and coach support staff, fostering a culture of continuous learning and operational excellence.
Requirements
- 5+ years of experience in cloud operations or SRE roles, with hands‑on expertise in AWS, Azure, and GCP.
- Proficiency with Kubernetes, container orchestration, and infrastructure-as-code tools (Terraform, CloudFormation).
- Strong scripting skills (Python, Bash) and familiarity with monitoring/alerting platforms (Prometheus, Grafana, Datadog).
- Experience with incident management frameworks (ITIL, SRE) and post‑mortem documentation.
- Excellent communication, problem‑solving, and leadership abilities.
Skills
AWS, Azure, GCP, Kubernetes