Data Scientist (Forecasting | LLM Evaluation | Python/SQL)
Contract (Outside IR35) | Remote | UK Public Sector Project
Overview
We are seeking a Senior Data Scientist to work within a multi-disciplinary team to develop and deploy modelling and forecasting capability within a large-scale organisational environment. This role involves translating analyst and business requirements into well-defined modelling problems, building and evaluating time series and supervised learning solutions, and assessing the outputs of large language models for accuracy and reliability. The Data Scientist will collaborate closely with technical and non-technical stakeholders, communicating modelling approaches and results clearly at each stage of delivery.
About the Role
We are looking for an experienced Data Scientist who can confidently translate ambiguous business requirements into rigorous modelling problems, build and evaluate forecasting and machine learning solutions, and assess the outputs of large language models with the same discipline applied to traditional statistical methods.
Key Responsibilities
- Translate requirements from analysts and business stakeholders into problems that can be addressed through statistical or machine learning approaches.
- Design, build, and validate time series forecasting models using both statistical methods (e.g. ARIMA) and machine learning approaches (e.g. random forest, deep learning).
- Apply supervised learning techniques to relevant business problems, with strategies such as active learning used where appropriate to improve model performance.
- Carry out feature engineering to support both forecasting and supervised learning models.
- Evaluate the outputs of large language models, applying methods such as prompt engineering, retrieval-augmented generation (RAG), and the use of metadata to support knowledge base accuracy.
- Communicate modelling approaches and assumptions to stakeholders before development, and present results and their implications iteratively as work progresses.
- Develop and maintain Python and SQL code under version control, working to DevOps practices for testing, deployment, and collaboration; document modelling decisions, assumptions, and evaluation criteria to organisational standards.
Mandatory Requirements
- Demonstrated experience in time series analysis, including machine learning approaches (feature engineering, deep learning, random forest) and statistical approaches to forecasting (e.g. ARIMA).
- Demonstrated experience in supervised learning approaches, with experience of strategies such as active learning considered an advantage.
- Proven ability to translate requirements from analysts into a problem that can be solved through modelling, and to communicate modelling approaches and outcomes clearly before, during, and after development.