Remote
Staff ML Infra Engineer, Search & Discovery - Coupang
Software Engineer
Lead the design and operation of scalable machine‑learning infrastructure for search and discovery, driving performance, reliability, and innovation across a high‑traffic e‑commerce platform.
About the role
Key Responsibilities
- Architect, build, and maintain end‑to‑end ML pipelines that power search relevance and recommendation engines at scale.
- Design and operate robust, highly available infrastructure on AWS, leveraging Kubernetes, Docker, and serverless services.
- Collaborate with data scientists to translate models into production‑ready services, ensuring low latency and high throughput.
- Implement monitoring, logging, and automated alerting to guarantee system reliability and rapid incident response.
- Drive continuous improvement of CI/CD workflows, containerization strategies, and cost‑optimization practices.
Requirements
- 10+ years of experience in large‑scale ML production environments.
- Deep expertise in Python, AWS, Kubernetes, Docker, and distributed data processing (e.g., Apache Spark).
- Proven track record of delivering high‑performance search or recommendation systems.
- Strong understanding of DevOps principles, observability, and performance tuning.
- Excellent communication skills and ability to mentor junior engineers.
Skills
Python, AWS, Kubernetes, Docker, Apache Spark