Remote
Senior DevOps/MLOps Engineer - Global Relay
MLOps Engineer
Lead end‑to‑end DevOps and MLOps initiatives, building scalable cloud pipelines, automating model deployment, and ensuring robust monitoring across AWS, Kubernetes, and Docker environments. Drive continuous delivery and operational excellence for data‑centric solutions.
About the role
Key Responsibilities
- Design, implement, and maintain CI/CD pipelines for data science workloads and production ML models using Git, Jenkins, and Argo CD.
- Provision and manage scalable Kubernetes clusters on AWS EKS, ensuring high availability and cost efficiency.
- Automate infrastructure as code with Terraform and CloudFormation, integrating security and compliance controls.
- Collaborate with data scientists to containerize models, orchestrate training jobs, and deploy them to production.
- Implement observability, logging, and alerting for both application and ML workloads using Prometheus, Grafana, and the ELK stack.
- Lead incident response, root‑cause analysis, and post‑mortem documentation to improve system reliability.
Requirements
- 5+ years of experience in DevOps/MLOps roles, with deep knowledge of cloud-native technologies.
- Proficiency in Kubernetes, Docker, and AWS services (EKS, S3, Lambda, CloudWatch).
- Strong scripting skills in Python and Bash; experience with Terraform or similar IaC tools.
- Hands‑on experience building and deploying ML pipelines, including model versioning and monitoring.
- Excellent problem‑solving skills, strong communication, and a collaborative mindset.
Skills
mlops, kubernetes, docker, aws, python, ci/cd