- Detail-oriented, hands-on, and metric-driven.
- Curious about LLMs, vector search, and retrieval systems.
- Collaborative with engineers, ETL teams, and business users.
- Comfortable designing evaluation pipelines and experiments.
- Strong communicator: turns feedback into actionable insights.
- Define evaluation sets across clients, content types, and languages.
- Build labelling pipelines and dashboards for manual or UI-based feedback.
- Measure and tune hybrid retrieval (semantic + keyword), top-k, rerankers, and filters.
- Evaluate embedding models and chunking strategies for accuracy and coverage.
- Assess LLM answer quality: grounded factuality, hallucination rate, completeness.
- Analyze failure patterns across query types, long contexts, and attachments vs. emails.
- Collaborate with backend, ETL, and frontend teams to align telemetry, schema, and feedback capture.
- Translate user feedback into actionable metrics and backlog items.
- Evaluate cost/performance trade-offs for summarization, spillover, and indexing.
- Applied ML/Data Science with strong evaluation discipline.
- Python (pandas, Jupyter/Notebooks) + SQL.
- Information retrieval / RAG evaluation: Recall@K, MRR, nDCG, citation grounding.
- Experience with LLMs: prompting, grounding, long-context handling, cost/latency awareness.
- Building reproducible pipelines and dashboards.
- Strong communicator: translating business questions into metrics and experiments.
- Hands-on with Azure AI Search: vector/hybrid tuning and index schema.
- Experience designing human-in-the-loop annotation pipelines.
- Knowledge of enterprise access-control / security trimming.
- Experience tracking embedding drift and retrieval failures.
- Working with large-scale document/email datasets.
Why Join Us
If you have any questions, contact our Talent Acquisition team at ta.admin@acuityanalytics.
For more details about Acuity Analytics, please see here.
Originally posted on Himalayas