Software Engineer
Full‑time remote role focused on testing an AI agent for harmful or toxic outputs. Responsibilities include crafting targeted prompts, evaluating responses, and ensuring the agent’s safety and compliance with content guidelines.
Project Objective
This project aims to test and evaluate one AI agent in the development stage. By asking questions and requests, based on various categories (the training is provided), you will try to force the AI agent to say something harmful, offensive, dangerous or toxic an evaluate it’s responses.
The end goal is to prevent AI agent from sharing any harmful, offensive, dangerous or toxic content to the end users and make it safer.
This role is fully remote, full-time with a duration of several months (depending on the project progress it might be prolonged).
Full online training before the project will be provided.
Responsibilities:
Requirements:
Originally posted on Himalayas
Posted June 22, 2026