onsite
Director, Data Science - AI Evaluations Platform - RBC
Data Scientist
Lead the data science team that designs and operates evaluation frameworks for AI models, overseeing dataset creation, LLM judge development, human evaluation protocols, and performance measurement to ensure safe, reliable production deployment.
About the role
Key Responsibilities
- Define and own the end‑to‑end evaluation platform for large language models, including dataset pipelines, deterministic scorers, and human‑in‑the‑loop processes.
- Build and mentor a high‑performing data science team, setting technical direction, hiring talent, and fostering a culture of rigorous measurement.
- Design LLM judges and statistical benchmarks that quantify model quality, safety, risk, and business impact.
- Collaborate with engineering, product, and risk partners to integrate evaluation results into production governance and continuous improvement loops.
- Publish clear measurement frameworks and dashboards that translate complex metrics into actionable insights for stakeholders.
Requirements
- 10+ years of experience in data science or machine learning, with a focus on model evaluation and performance measurement.
- Deep expertise in large language models, evaluation metrics, and statistical analysis techniques.
- Proficiency in Python and related data‑science libraries (e.g., pandas, NumPy, scikit‑learn, PyTorch/TensorFlow).
- Demonstrated ability to lead and scale high‑impact data‑science teams in a fast‑moving, regulated environment.
- Strong communication skills to convey technical findings to both technical and non‑technical audiences.
Skills
Machine learning, Python