LLM Inference Engineer
Hippocratic AI is seeking an experienced LLM Inference Engineer to optimize its large language model (LLM) serving infrastructure. The role involves designing and implementing multi-node serving architectures, applying advanced quantization techniques, and optimizing multi-LoRA serving systems to deploy efficient and scalable LLM systems in production.
Hippocratic AI is the leading generative AI company in healthcare. We have the only system that can have safe, autonomous, clinical conversations with patients. We have trained our own LLMs as part of our Polaris constellation, resulting in a system with over 99.9% accuracy.
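We're seeking an experienced LLM Inference Engineer to optimize our large language model (LLM) serving infrastructure. The ideal candidate has experience designing and implementing multi-node serving architectures, applying advanced quantization techniques, and optimizing multi-LoRA serving systems in production.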
_Show us what you've built: Tell us about an LLM inference or training project that makes you proud! Whether you've optimized inference pipelines to achieve breakthrough performance, designed innovative training techniques, or built systems that scale to billions of parameters - we want to hear your story._
_Open source contributor? Even better! If you've contributed to projects like vLLM, SGLang, LMDeploy, or similar LLM optimization frameworks, we'd love to see your PRs. Your contributions to these communities demonstrate exactly the kind of collaborative innovation we value._
Join a team where your expertise won't just be appreciated—it will be celebrated and amplified. Help us shape the future of AI deployment at scale!
Posted June 3, 2026