Remote
Staff Machine Learning Ops Engineer - Preply
MLOps Engineer
Lead the design, deployment, and scaling of machine‑learning pipelines on cloud infrastructure, using Python, Kubernetes, Docker, and AWS to deliver reliable, production‑grade AI services.
About the role
Key Responsibilities
- Architect, build, and maintain end‑to‑end MLOps platforms that support model training, validation, and serving at scale.
- Implement robust CI/CD pipelines for data, model, and code artifacts, ensuring reproducibility and rapid iteration.
- Containerize machine‑learning workloads with Docker and orchestrate them on Kubernetes clusters across multi‑region AWS environments.
- Collaborate with data scientists and software engineers to translate research prototypes into production‑ready services.
- Monitor system performance, establish alerting, and continuously optimize cost, latency, and reliability.
Requirements
- 5+ years of hands‑on experience in MLOps, DevOps, or cloud engineering, with a strong focus on productionizing ML models.
- Proficiency in Python for scripting, automation, and integration with ML frameworks such as TensorFlow or PyTorch.
- Deep knowledge of Kubernetes, Docker, and AWS services (EKS, S3, SageMaker, IAM, CloudWatch).
- Experience designing CI/CD workflows using tools like GitHub Actions, Jenkins, or GitLab CI.
- Strong problem‑solving skills, the ability to work cross‑functionally, and a passion for scaling AI‑driven products.
Skills
Python, Kubernetes, Docker, AWS, CI/CD, TensorFlow