Remote

LLM Infrastructure Engineer

As an LLM Infrastructure Engineer, you will build and maintain training and serving pipelines for large language models. The work includes optimizing inference performance, contributing to LLM runtime code in Rust and/or C++, and deploying models at scale with tools such as vLLM and Baseten.

About the role

Responsibilities

Build and maintain training infrastructure, feature stores, and model serving pipelines
Optimize LLM inference performance — compute efficiency, memory management, latency, and throughput
Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++)
Deploy and manage models at scale using tools like vLLM and Baseten
Architect scalable pipelines for model training and serving across GPU infrastructure
Collaborate with ML and data engineers to ensure the platform meets research and production needs

Skills

LLM, Rust, C++, vLLM, Baseten, GPU

Company: Confidential

Department: Engineering

Location: United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 10, 2026