onsite
Senior MLOps Platform Engineer
DevOps Engineer
Design and operate scalable MLOps platforms on AWS, using Amazon EKS, ArgoCD, and Arize AI to deliver reliable batch inference pipelines for data science teams.
About the role
Key Responsibilities
- Architect, build, and maintain a production‑grade MLOps platform on AWS, using Amazon EKS for container orchestration.
- Implement continuous delivery pipelines with ArgoCD to automate model deployment and versioning.
- Integrate Arize AI for model monitoring, drift detection, and performance analytics.
- Design and optimize batch inference workflows, ensuring high throughput and cost‑effective scaling.
- Collaborate with data scientists and software engineers to translate model requirements into reliable, reproducible pipelines.
Requirements
- 5+ years of experience in cloud infrastructure, preferably AWS, with deep knowledge of EKS, IAM, and networking.
- Strong background in MLOps concepts, CI/CD for ML, and containerized deployments.
- Hands‑on experience with GitOps tools such as ArgoCD and monitoring platforms like Arize AI.
- Proficiency in scripting or programming (Python, Bash) for automation and pipeline development.
- Demonstrated ability to design high‑throughput batch inference systems and troubleshoot performance issues.