Research Scientist - RL Training
Snorkel is seeking a Research Scientist to focus on reinforcement learning for training and aligning large language models. This role involves researching and implementing RL techniques like GRPO, RLHF, and DPO, translating them into data products, and designing data pipelines for high-quality training signals. The Research Scientist will contribute to Snorkel's data-as-a-service offering by advancing RL data capabilities and staying current with cutting-edge LLM training and alignment research.
At Snorkel, we believe meaningful AI doesn’t start with the model; it starts with the data. We’re on a mission to help enterprises transform expert knowledge into specialized AI at scale. The AI landscape has gone through incredible changes from 2015, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
We're looking for a Research Scientist to work on reinforcement learning for training and aligning large language models. This is a foundational research role focused on one of the most consequential open data problems in AI: how to generate the data, reward signals, and training procedures that steer LLM behavior in reliable and generalizable directions. It is also a core capability that directly differentiates Snorkel's data-as-a-service offering. You'll work closely with Snorkel's research, engineering, and delivery teams to advance our RL data capabilities by translating research ideas into the preference datasets, reward models, and RL-ready corpora we produce for frontier AI labs. You'll also contribute to a research agenda that is central to Snorkel's long-term differentiation as a provider of bespoke training data.
Posted May 26, 2026