Remote
Automation Engineer - Nebius
QA Engineer
Automation Engineer focused on building and maintaining CI/CD pipelines, infrastructure-as-code, and container orchestration for large‑scale AI cloud platforms using Python, Kubernetes, Docker, and AWS.
About the role
Key Responsibilities
- Design, develop, and maintain automated CI/CD pipelines for AI/ML workloads, ensuring fast, reliable releases.
- Implement and manage container orchestration using Kubernetes and Docker to support GPU‑intensive training and inference services.
- Develop infrastructure‑as‑code solutions (e.g., Terraform, CloudFormation) on AWS to provision scalable compute, storage, and networking resources.
- Collaborate with software and data science teams to integrate monitoring, logging, and alerting for production‑grade AI services.
- Continuously improve automation frameworks, scripts, and tooling to reduce manual effort and increase system reliability.
Requirements
- 3+ years of experience in automation engineering or DevOps, preferably in AI/ML or cloud‑native environments.
- Strong proficiency in Python for scripting and tool development.
- Hands‑on experience with Kubernetes, Docker, and container lifecycle management at scale.
- Solid understanding of CI/CD concepts and tools such as Jenkins, GitLab CI, or GitHub Actions.
- Experience provisioning and managing resources on AWS, including EC2, S3, and IAM, using IaC frameworks.
Skills
Python, Kubernetes, Docker, CI/CD, AWS