onsite

Research Engineer, Multimodal Reasoning For Information Literacy

As a Research Engineer at Google DeepMind, you will develop and apply multimodal reasoning systems and Vision-Language Models (VLMs) to assess the trustworthiness of online media. This role involves rapid prototyping, designing and training multimodal models, and engaging with product teams to advance research in information literacy.

About the Role

At Google DeepMind, our research team is dedicated to tackling the most complex challenges in online information quality. We strive to advance the state of the art by developing innovative solutions to detect manipulated media and misleading narratives, ensuring the integrity of digital discourse. Our interdisciplinary work spans provenance analysis and the creation of tools for AI-assisted information literacy, applying our technologies to make the online environment safer for the public. We thrive in a supportive environment that encourages rapid prototyping and iteration, driving our research achievements directly into Google’s flagship models, including Gemini.

To succeed in this role, you will need to be passionate about advancing information literacy using machine learning and other computational techniques. You'll join an interdisciplinary team of domain experts, ML researchers, and engineers to research and build multimodal reasoning systems and Vision-Language Models (VLMs) to assess the trustworthiness of media (images, audio, and videos) on the internet.

Key Responsibilities

Plan and perform rapid prototyping of computer vision and multimodal machine learning techniques for determining the authenticity of media.
Design and train multimodal models capable of complex visual reasoning.
Undertake exploratory analysis to inform experimentation and research directions.
Engage with product teams to drive the development of our research.
Implement tools, libraries, and frameworks to speed up and enable new research.
Report and present research findings, software developments, experimental results, and data analysis clearly and efficiently.
Collaborate with internal and external scientific domain experts.

Requirements

PhD/Master’s degree in Computer Science, AI, ML, or equivalent practical experience.
At least 2 years of relevant experience developing computer vision techniques or multimodal machine learning models.
Experience in software development using Python and deep learning frameworks (e.g., JAX, TensorFlow, PyTorch), with a proven track record of building high-quality research prototypes and systems.
Quantitative skills in math and statistics.
Experience exploring, analysing and visualising data.

Preferred Qualifications

Experience in training and deployment of large-scale models.
Experience with Video Understanding.
Experience with Large Language Models, prompt engineering, few-shot learning, post-training techniques, and evaluations.
A proven track record of research or engineering achievements, such as publications in peer-reviewed conferences or journals.

When assessing technical background, we will take a holistic view of the mix of scientific, ML, and computational experience. We do not expect you to be an expert in all fields simultaneously.
