Research Scientist, Safety Post-Training
As a Research Scientist specializing in Safety Post-Training at Scale Labs, you will develop and apply advanced post-training and interpretability methods to enhance the safety and understanding of frontier AI systems. Your work will involve designing pipelines to study model behavior, creating evaluations for unsafe behaviors, and collaborating with stakeholders to translate research into actionable safety standards.
The aim is to make frontier AI systems safer and better understood by researchers and policymakers. The role involves tackling hard problems in agent robustness, AI control protocols, and AI risk evaluations; collaborating across industry, the public sector, and academia; and regularly publishing findings.
Posted June 1, 2026