onsite

LLM Systems Engineer

As an LLM Systems Engineer, you will build and optimize infrastructure for large language models, focusing on training, serving, and inference performance. This role involves contributing to LLM runtime code in Rust/C++ and deploying models at scale.

About the role

As an LLM Systems Engineer, you will be instrumental in building and optimizing the infrastructure that powers our large language models. This role involves deep technical work on model performance and serving pipelines, in close collaboration with our ML and data engineering teams.

Responsibilities

Build and maintain training infrastructure, feature stores, and model serving pipelines.
Optimize LLM inference performance — focusing on compute efficiency, memory management, latency, and throughput.
Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++).
Deploy and manage models at scale using tools like vLLM and Baseten.
Architect scalable pipelines for model training and serving across GPU infrastructure.
Collaborate with ML and data engineers to ensure the platform meets research and production needs.

Skills

Rust, C++, vLLM, Baseten, LLM inference, LLM runtime, GPU infrastructure, training infrastructure, feature stores, model serving pipelines, ML

Company: Baseten

Department: Engineering

Location: San Francisco, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 8, 2026