Remote, Onsite

AI Benchmarking Specialist - Amazon.com

Software Engineer

Lead AI benchmarking for Gen‑AI/LLM tools: design tests and analyze model quality, compliance, robustness, and fairness to drive seller growth on a global platform.

About the role

Key Responsibilities

Design and execute comprehensive benchmarking and audit protocols for Gen‑AI and LLM solutions used by international sellers.
Collect, annotate, and analyze data to evaluate model performance, compliance, robustness, and fairness.
Collaborate with cross‑functional teams to refine AI tools, ensuring they meet business and regulatory standards.
Document findings, create detailed reports, and present actionable insights to stakeholders.
Continuously improve benchmarking frameworks and tools based on emerging AI research and industry best practices.

Requirements

Strong background in Machine Learning and experience with Large Language Models.
Ability to work independently and collaborate across global teams.

Skills

Python, machine learning

Company: Amazon.com

Department: Engineering

Location: Karnataka, India

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 23, 2026