onsite
Machine Learning Engineer - Inference - Mindbeam
ML Engineer
Develop user‑facing APIs, SDKs, and tools that expose cutting‑edge inference models, collaborating with research and product teams to deliver scalable, secure, and high‑performance ML interfaces.
About the role
Key Responsibilities
- Design and implement robust, user‑facing APIs and SDKs that provide seamless access to inference services.
- Collaborate with research scientists to translate complex model pipelines into reusable, production‑ready components.
- Optimize inference workloads for latency, throughput, and cost across GPU and CPU environments.
- Secure all external interfaces, covering authentication and compliance requirements.
- Containerize services using Docker and orchestrate deployments with Kubernetes for high availability.
Requirements
- Strong programming experience in Python and C++.
- Hands‑on expertise with deep‑learning frameworks such as TensorFlow or PyTorch.
- Proven ability to build and maintain RESTful APIs and SDKs for ML workloads.
- Experience with containerization (Docker) and orchestration (Kubernetes) in production settings.
- Solid understanding of inference optimization techniques, including quantization, batching, and hardware acceleration.
Skills
python, c, tensorflow, pytorch, docker, kubernetes