Software Engineer
Responsible for generating targeted prompts to evaluate an AI agent’s responses, ensuring it does not produce harmful, offensive, or toxic content. Works remotely, applying NLP and safety testing techniques in Portuguese.
Project Objective
This project aims to test and evaluate an AI agent that is still in development. By asking questions and making requests across various categories (training is provided), you will try to provoke the AI agent into saying something harmful, offensive, dangerous, or toxic, and then evaluate its responses.
The end goal is to prevent the AI agent from sharing any harmful, offensive, dangerous, or toxic content with end users, making it safer.
This role is fully remote and full-time, with a duration of several months (it may be extended depending on project progress).
Full online training will be provided before the project starts.
Responsibilities:
Requirements:
Originally posted on Himalayas
Posted June 22, 2026