onsite

Engineering Manager, AI Inference Systems

OpenAI is seeking an Engineering Manager for AI Inference Systems to lead critical work on their "Engine" service, which powers model inference for GPT-4 and ChatGPT. This role involves scaling inference infrastructure, hiring top AI systems engineers, and ensuring the efficient and reliable deployment of current and future LLMs.

About The Team

The Applied AI team safely brings OpenAI's technology to the world. They have released ChatGPT, Plugins, DALL·E, and the APIs for GPT-4, GPT-3, embeddings, and fine-tuning. They also operate inference infrastructure at scale. The team seeks to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is paramount, even over unfettered growth. They serve end-users directly through ChatGPT and developers through their APIs, which power product features never before possible.

About The Role

Model inference at OpenAI is powered by a single service called "Engine", which wraps the PyTorch transformers for GPT-4 and ChatGPT. OpenAI is looking for an engineering manager to help lead critical work on this service and grow the team.

In This Role, You Will

Own substantial portions of our inference stack.
Ensure the ability to run GPT-4, ChatGPT, and future models at increasingly high scale with increasing efficiency.
Hire world-class AI systems engineers in one of the most competitive hiring markets.
Coordinate the inference needs of OpenAI's teams and products.
Create a diverse, equitable, and inclusive culture that makes everyone feel welcome while enabling radical candor and the challenging of groupthink.

You Might Thrive In This Role If You

Have 3+ years of experience in engineering management and 7+ years as an IC working with high-scale distributed systems and ML systems.
Have experience with ML systems, particularly high-scale distributed inference for modern LLMs.
Have experience with highly available, reliable, production-grade systems at scale.
Have familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.
Care deeply about diversity, equity, and inclusion, and have a track record of building inclusive teams.
Have experience closing extremely competitive candidates for your team, and the ability to craft and convey compelling visions of the future.
Have a voracious and intrinsic desire to learn and fill in missing skills, and an equally strong talent for sharing learnings clearly and concisely with others.
Are comfortable with ambiguity and rapidly changing conditions. You view changes as an opportunity to add structure and order when necessary.

As technical context: at the heart of OpenAI's infrastructure is a large-scale deployment of GPU nodes running in dozens of Kubernetes clusters across regions. Core technologies they build with include Python, PyTorch, CUDA, Triton, Redis, InfiniBand, NCCL, and NVLink.
