onsite

Senior Software Engineer, vLLM - CommonAI C.I.C.

Software Engineer

CommonAI C.I.C. is hiring a Senior Software Engineer to design, implement, and scale the open‑source vLLM inference engine, using Python, C++, CUDA, and cloud‑native technologies such as Kubernetes and AWS for high‑throughput LLM serving.

About the role

Key Responsibilities

Architect, develop, and maintain the vLLM inference engine to deliver low‑latency, high‑throughput LLM serving.
Implement performance‑critical components in C++/CUDA and integrate them with Python APIs.
Design and operate cloud‑native deployment pipelines using Kubernetes and AWS services for scalable production workloads.
Collaborate with open‑source contributors and internal research teams to incorporate the latest model optimizations and safety features.
Write comprehensive tests, documentation, and monitoring tools to ensure reliability and observability in production.

Requirements

5+ years of software engineering experience, with deep expertise in Python and C++ development.
Strong background in GPU programming (CUDA) and performance optimization for large language models.
Hands‑on experience deploying distributed AI workloads on Kubernetes and cloud platforms such as AWS.
Proficiency with deep learning frameworks, especially PyTorch, and familiarity with LLM architectures.
Track record of contributing to open‑source projects and working in collaborative, fast‑paced engineering environments.

Skills

python, c, cuda, pytorch, kubernetes, aws

Company: CommonAI C.I.C.

Department: Engineering

Location: Cambridge, United Kingdom

Experience: 5+ years

Tenure: full-time

Level: Senior

Posted June 26, 2026