onsite

Inference Engineer

Cartesia is seeking an Inference Engineer to design and build low-latency, scalable, and reliable model inference and serving stacks for their cutting-edge foundation models. This role involves close collaboration with research and product teams to deliver fast, cost-effective AI solutions and build robust inference infrastructure.

About the role

We're hiring an Inference Engineer to advance our mission of building real-time multimodal intelligence.

Your Impact

Design and build a low-latency, scalable, and reliable model inference and serving stack for our cutting-edge foundation models, including Transformers, SSMs, and hybrid models.
Work closely with our research team and product engineers to serve our suite of products in a fast, cost-effective, and reliable manner.
Design and build robust inference infrastructure and monitoring for our products.
Have significant autonomy to shape our products and directly impact how cutting-edge AI is applied across various devices and applications.

What You Bring

Given the scale and difficulty of the problems we work on, we value strong engineering skills at Cartesia.

Strong engineering skills, including comfort navigating complex codebases and an eye for writing clean, maintainable code.
Experience building large-scale distributed systems with high demands on performance, reliability, and observability.
Technical leadership with the ability to execute and deliver zero-to-one results amidst ambiguity.
Background in or experience working on inference pipelines with machine learning and generative models.
Experience implementing state-of-the-art machine learning models and applying research to practical problems.
Preferred: experience with inference frameworks such as vLLM or SGLang, and with techniques such as continuous batching.
Preferred: experience working in CUDA, Triton, or similar.

Skills

Transformers, SSMs, vLLM, SGLang, Continuous Batching, CUDA, Triton, Machine Learning, Generative Models, Distributed Systems

Company: Cartesia

Department: Research

Location: San Francisco, United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Salary: 21,250,000

Posted June 11, 2026