onsite
LLM Systems Engineer
As an LLM Systems Engineer, you will build and optimize infrastructure for large language models, focusing on training, serving, and inference performance. This role involves contributing to LLM runtime code in Rust/C++ and deploying models at scale.
About the Role
As an LLM Systems Engineer, you will be instrumental in building and optimizing the infrastructure that powers our large language models. This role involves deep technical work on model performance, serving pipelines, and collaborating closely with our ML and data engineering teams.
Responsibilities
- Build and maintain training infrastructure, feature stores, and model serving pipelines.
- Optimize LLM inference performance, with a focus on compute efficiency, memory management, latency, and throughput.
- Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++).
- Deploy and manage models at scale using tools like vLLM and Baseten.
- Architect scalable pipelines for model training and serving across GPU infrastructure.
- Collaborate with ML and data engineers to ensure the platform meets research and production needs.
Skills
Rust, C++, vLLM, Baseten, LLM inference, LLM runtime, GPU infrastructure, training infrastructure, feature stores, model serving pipelines, ML