hybrid

Research Engineer - Evaluations - AI Safety Institute

The AI Safety Institute is seeking a Research Engineer - Evaluations to build and maintain scientific software, enabling high-quality research on advanced AI systems. This role involves developing and conducting evaluations, driving foundational AI safety research, and facilitating information exchange within a cross-functional team. The Research Engineer will contribute to various workstreams, from chemical/biological misuse to autonomous systems, and play a crucial role in building bespoke infrastructure and tools for research projects.

About The Job

The AI Safety Institute is the first state-backed organisation focused on advanced AI safety for the public interest. We launched at the AI Safety Summit because we believe taking responsible action on this extraordinary technology requires a capable and empowered group of technical experts within government. Our staff includes senior alumni from OpenAI, Google DeepMind, start-ups and the UK government, and ML professors from Oxford and Cambridge. We are now calling on the world’s top technical talent to build the institute from the ground up. This is a truly unique opportunity to help shape AI safety at an international level. We have ambitious goals and need to move fast.

Develop and conduct evaluations on advanced AI systems: We will characterise safety-relevant capabilities, understand the safety and security of systems, and assess their societal impacts.
Drive foundational AI safety research: We will launch moonshot research projects and convene world-class external researchers.
Facilitate information exchange: We will establish clear information-sharing channels between the Institute and other national and international actors. These include stakeholders such as policymakers and international partners.

About the Role

Research Engineers build and maintain scientific software to enable high quality research. They are uniquely placed to bridge the world of software engineering and research, and at the AI Safety Institute will be involved in challenging and diverse projects at the cutting edge of advanced AI development.

As a Research Engineer you will either be embedded within one or more of our research teams, or you will sit in a cross-cutting group of Research Engineers within the Platform Team.

In either case, you will collaborate with research scientists and people running evaluations and user studies on the one hand, and with our Platform Team on the other. You might also onboard, run and improve existing evaluations from the wider research community, as well as scale up new evaluation methods developed in-house.

We draw on a wide range of disciplines, and value a diversity of research expertise across our five workstreams. You will be primarily associated with one of our workstreams (please specify in your application which you’re most interested in); however, your work will sometimes intersect multiple workstreams.

Chem/bio: studying how LLMs and more specialised AI systems are advancing biological and chemical capabilities relating to harmful outcomes. This includes potential uplift to novice actors and future scenarios like design of biological agents
Cyber misuse: studying how LLMs and more specialised AI systems may aid in cyber-criminality and the adequacy of cybersecurity measures against AI systems
Safeguards: evaluating the strength and efficacy of safety and security components of advanced AI systems against diverse threats which could circumvent safeguards
Societal impacts: evaluating a range of impacts of advanced models that could have widespread implications for our societal fabric (e.g. undermining trust in information, psychological wellbeing, cognitive wellbeing, unequal outcomes)
Autonomous systems: testing for precursors to loss of control by measuring relevant capabilities in long-horizon computer-based tasks. Examples are sub-tasks of autonomous replication, AI development and self-improvement, as well as adaptation to human attempts to intervene and the ability to profitably interact with and manipulate humans. This includes trajectories that start from a misuse event as well as cases of misalignment.

The Platform Team will be providing the foundational infrastructure for our research projects. You will build on top of our platform to create bespoke, load-bearing infrastructure and tools for individual research projects. You will be able to independently run and analyse your own experiments to diagnose problems and understand our research work and tech stack in detail.

You will spend your time working not just on infrastructure code but also in the planning and execution of research projects, such as a wide range of evaluations of cutting-edge AI systems. This includes working on analysing and visualising the outcomes of complex evaluation or fine-tuning procedures and managing large data sets.

As a Research Engineer, it is your responsibility to make the hard trade-offs between when code needs to be load-bearing enough to support multiple experiments and when it is better to write “good enough” code to quickly prove or disprove a hypothesis. In this you will work very closely with our Research Scientists, who will often be the main users of the tools you build.

Person specification

This role may be a great fit if you:

Have excellent knowledge of training, fine-tuning, scaffolding, prompting, deploying, and/or evaluating current cutting-edge machine learning systems such as large language models
Have substantial experience working in a similar role in industry, relevant open-source collectives, or academia
Have experience conducting research, both independently and, most importantly, as part of a cross-functional team
Possess a strong curiosity about how AI systems work and the ability to develop data collection, analysis and visualisation interfaces to investigate them
Have substantial experience in building software systems to meet research requirements and have led or been a significant contributor to relevant software projects, demonstrating cross-functional collaboration skills
Deeply care about the user experience of a diverse range of users, from machine learning researchers to domain experts, to wide and diverse groups of human evaluators
Work autonomously and in a self-directed way with high agency, thriving in a constantly changing environment and a steadily growing team, while figuring out the best and most efficient ways to solve a particular problem
Bring your own voice and experience but also an eagerness to support your colleagues together with a willingness to do whatever is necessary for the team’s success and find new ways of getting things done within government
Have a sense of mission, urgency, and responsibility for success, demonstrating problem-solving abilities and preparedness to acquire any missing knowledge necessary to get the job done

Core Requirements

You should be able to spend at least 4 days per week working with us
You should be able to join us for at least 12 months
You should be able to work from our office in London (Whitehall) for part of the week, but we provide flexibility for remote work

Benefits

Alongside your salary of £85,000, the Department for Science, Innovation and Technology contributes £11,473 towards your membership of the Civil Service Defined Benefit Pension scheme. Find out what benefits a Civil Service Pension provides.

The Department for Science, Innovation and Technology offers a competitive mix of benefits including:

A culture of flexible working, such as job sharing, homeworking and compressed hours.
Automatic enrolment into the Civil Service Pension Scheme, with an average employer contribution of 27% of the base salary.
A minimum of 25 days of paid annual leave, increasing by 1 day per year up to a maximum of 30.
An extensive range of learning & professional development opportunities.
