Remote
Senior Cloud Support Engineer - CRUSOE
Software Engineer
The Senior Cloud Support Engineer designs, deploys, and maintains scalable cloud infrastructure across AWS, Azure, and GCP, using Kubernetes, Terraform, and Python to keep AI workloads highly available.
About the role
Key Responsibilities
- Architect and manage multi‑cloud environments (AWS, Azure, GCP) to support AI compute workloads.
- Implement and maintain Kubernetes clusters, ensuring high availability, security, and performance.
- Automate infrastructure provisioning and configuration using Terraform and CI/CD pipelines.
- Monitor system health, troubleshoot incidents, and provide root‑cause analysis for production issues.
- Collaborate with AI research and engineering teams to optimize resource utilization and cost efficiency.
Requirements
- 5+ years of experience in cloud operations and support.
- Proficiency with AWS, Azure, and GCP services (EC2, EKS, GKE, etc.).
- Strong scripting skills in Python and experience with Terraform, Ansible, or similar tools.
- Deep understanding of Kubernetes architecture, networking, and security best practices.
- Excellent problem‑solving skills and ability to work in a fast‑paced, high‑impact environment.
Skills
AWS, Azure, Kubernetes, Terraform, Python