Engineering Manager, AI Inference Systems
OpenAI is seeking an Engineering Manager for AI Inference Systems to lead critical work on its "Engine" service, which powers model inference for GPT-4 and ChatGPT. This role involves scaling inference infrastructure, hiring top AI systems engineers, and ensuring the efficient and reliable deployment of current and future LLMs.
The Applied AI team safely brings OpenAI's technology to the world. They have released ChatGPT, Plugins, DALL·E, and the APIs for GPT-4, GPT-3, embeddings, and fine-tuning. They also operate inference infrastructure at scale. The team seeks to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is paramount, even over unfettered growth. They serve end-users directly through ChatGPT and developers through their APIs, which power product features never before possible.
Model inference at OpenAI is powered by a single service called "Engine", which wraps the PyTorch transformers for GPT-4 and ChatGPT. OpenAI is looking for an engineering manager to help lead critical work on this service and grow the team.
For technical context: at the heart of OpenAI's infrastructure is a large-scale deployment of GPU nodes running in dozens of Kubernetes clusters across regions. Some core technologies they build with include Python, PyTorch, CUDA, Triton, Redis, InfiniBand, NCCL, and NVLink.
Posted June 7, 2026