onsite
Cloud Engineer - AI - evoke
DevOps Engineer
Design, build, and operate AI‑focused cloud infrastructure on AWS, leveraging Terraform, Kubernetes, and Python to deliver scalable, secure, and automated machine‑learning pipelines.
About the role
Key Responsibilities
- Architect and implement AI‑ready cloud solutions on AWS, ensuring high availability, security, and cost efficiency.
- Develop and maintain infrastructure‑as‑code using Terraform to provision compute, storage, networking, and ML services.
- Containerize and orchestrate machine‑learning workloads with Kubernetes, integrating CI/CD pipelines for automated deployment.
- Collaborate with data scientists to optimize model training and inference pipelines, providing reliable compute resources and monitoring.
- Implement monitoring, logging, and alerting for AI workloads, troubleshooting performance and reliability issues.
Requirements
- 3+ years of experience designing and operating cloud infrastructure, preferably on AWS.
- Proficiency with Terraform or similar IaC tools and strong scripting skills in Python.
- Hands‑on experience with Kubernetes, Docker, and CI/CD platforms (e.g., Jenkins, GitLab CI).
- Understanding of machine‑learning workflows and of tools such as SageMaker, TensorFlow, or PyTorch.
- Solid grasp of networking, security best practices, and cost‑optimization strategies in cloud environments.
Skills
AWS, Terraform, Python, Kubernetes, machine learning, CI/CD