remote
LLM Infrastructure Engineer
As an LLM Infrastructure Engineer, you will build and maintain robust training and serving pipelines for large language models. The work includes optimizing inference performance, contributing to LLM runtime code in Rust and/or C++, and deploying models at scale using tools like vLLM and Baseten.
About the role
Responsibilities
- Build and maintain training infrastructure, feature stores, and model serving pipelines
- Optimize LLM inference performance: compute efficiency, memory management, latency, and throughput
- Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++)
- Deploy and manage models at scale using tools like vLLM and Baseten
- Architect scalable pipelines for model training and serving across GPU infrastructure
- Collaborate with ML and data engineers to ensure the platform meets research and production needs
Skills
LLM, Rust, C++, vLLM, Baseten, GPU