Senior Staff Machine Learning Scientist, CSxAI - Evaluation & Data Flywheel
As a Senior Staff Machine Learning Scientist, you will define the technical direction and lead the execution of ML evaluation and the end-to-end data flywheel for CSxAI products. The role involves developing strategies and frameworks to measure quality, turn feedback into learning signals, and continuously improve AI models and products safely and efficiently, in partnership with product, engineering, design, and operations.
Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.
AI and ML are at the heart of the Airbnb product. From Trust to Payments, and from Customer Service to Marketing, we rely on ML to ensure that guests and hosts have the best possible experience with Airbnb.
The Core ML team is responsible for driving CSxAI (Customer Support x Artificial Intelligence) initiatives by adopting Generative AI technologies to enable an intelligent, scalable, and exceptional service experience. The team develops and enhances AI models, ML services, and tools including LLM fine-tuning and optimization, RAG/Search, LLM evaluation and testing automation, feedback-based learning, and guardrails for a wide range of applications at Airbnb.
The richness of Airbnb's data, the complexity of its marketplace, and the variety innate in our product mean that we need to operate at the state of the art of AI practice. We are committed to long-term innovation to solve complex problems, and to do that we need experienced ML talent.
In this Senior Staff role, you will set technical direction and lead execution for ML evaluation and the end-to-end data flywheel powering CSxAI products (e.g., assistive agents, issue resolution, and tooling). Your work will define how we measure quality, how we turn feedback into learning signals, and how we continuously improve models and products safely and efficiently. You will partner closely with product, engineering, design, and operations to build evaluation systems that are trusted, scalable, and actionable, connecting offline metrics to online outcomes.
Posted June 4, 2026