Research, Evals
Exa is seeking an ML Evals Engineer to design and build their evaluation stack for search engines in an LLM world. This role involves developing comprehensive evaluation suites, creating datasets, and collaborating with research and engineering teams to define and improve search quality. The work will directly influence the direction of the company and the research team's focus.
Exa is an applied AI lab building a search engine unlike anything the world has ever seen. We build massive-scale infra to crawl the entire web, train state-of-the-art embedding models to process it, and design high-performance vector databases to retrieve over it. We now power search for Cursor, Cognition, HubSpot, and over 400,000 developers, and have raised $350m from Lightspeed, Benchmark, and a16z.
Our ultimate goal is to build perfect search over all the world's information, far beyond Google. If you want to build massive-scale ML systems that will define the way the new AI world consumes information, this is the place for you.
The ML organization sits at the heart of our mission. We train foundational models for search. Our goal is to build systems that can instantly filter the world's knowledge to exactly what you want, no matter how complex your query. In short, we want to put the web into an extremely powerful database.
And to do that well, we need to measure what “good search” actually means. That’s where you come in.
We're looking for an ML evals engineer to design and build our eval stack at Exa. The role involves investigating how to evaluate search engines in an LLM world and then building the most comprehensive, creative, and effective eval suite. You will be deciding the future of search through the evals we choose to optimize for: your work will directly influence what the research team works on and shape the direction of the company.
Posted June 9, 2026