The Staff/Principal Engineer, AI Platform will build the core infrastructure powering Databricks' AI offerings, including data apps, AI agents, model training, and serving. This role focuses on improving the reliability, latency, and efficiency of distributed AI workloads while collaborating with platform, infra, and ML teams to shape the future of AI development on the Databricks platform.
About the role
As part of the AI Platform team, you’ll build the substrate that powers everything from data apps and AI agents to model training, model serving, and vector search. You’ll be joining a high-agency, high-visibility team operating at the frontier of AI infrastructure, with deep ties to research, product, and real-world enterprise use cases. Databricks Mosaic AI is one of our fastest-growing businesses, helping thousands of our customers democratize AI within their organizations. We’re building the infrastructure that powers the next generation of AI.
The impact you will have:
Build infrastructure that powers our flagship offerings, such as MLflow, AI Gateway, Databricks Apps, Agent Framework, Agent Bricks, and Foundation Model APIs, to name a few
Improve reliability, latency, and efficiency of distributed AI workloads
Collaborate with platform, infra, and ML teams to deliver seamless end-to-end experiences
Shape how developers and data scientists build and interact with AI on Databricks
What we look for:
5+ years of experience in backend or infrastructure engineering
Strong programming skills in Scala, Go, or Python
Experience with distributed systems, scalable APIs, or cloud-native infrastructure
Familiarity with service-oriented architecture, deployment pipelines, and system observability
Strong product and ownership mindset — you care about building the right solution, not just any solution
Bonus points for:
Experience with real-time serving, ML infrastructure, or GPU orchestration
Exposure to platforms like SageMaker, Vertex AI, or Azure ML
Contributions to OSS projects like MLflow, PyTorch, or Ray
Experience building developer platforms or internal tools that support AI workflows