Research Scientist, Safety Post-Training
As a Research Scientist focusing on Safety Post-Training, you will develop and apply advanced methods and interpretability techniques to enhance the safety and understanding of frontier AI systems. Your responsibilities include designing post-training pipelines to study model safety and robustness, developing interpretability-informed evaluations for undesirable behaviors, and collaborating with stakeholders to translate findings into actionable safety standards and best practices.
As a Research Scientist working on Safety Post-Training at Scale Labs, you will develop and apply post-training methods and interpretability techniques to make frontier AI systems safer and better understood by researchers and policymakers. As a leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Scale Labs has launched a new team focused on policy research, bridging the gap between AI research and global policymakers to enable informed, scientifically grounded decisions about AI risks and capabilities.
Our research addresses challenging problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. The team collaborates broadly across industry, the public sector, and academia, and regularly publishes its findings.
Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement.
Posted June 2, 2026