onsite
AI Researcher - Agentic AI
Research Engineer
Lead research on agentic AI: design benchmarking and evaluation methodologies for agents built on the Anthropic and Gemini APIs, and apply ML and NLP techniques to turn the results into actionable insights.
About the role
Key Responsibilities
- Develop and refine benchmarking frameworks for agentic AI systems using Anthropic and Gemini APIs.
- Design evaluation metrics that capture autonomy, safety, and performance of AI agents.
- Implement experiments in Python, analyze results, and iterate on methodology.
- Collaborate with cross‑functional teams to integrate findings into product roadmaps.
- Publish research findings and present at conferences and internal workshops.
Requirements
- PhD or equivalent experience in Machine Learning, NLP, or related fields.
- Strong background in designing and conducting AI benchmarks and evaluation studies.
- Proficiency in Python and experience with large language model APIs.
- Excellent analytical, communication, and publication skills.
Skills
Machine learning, natural language processing, Python