onsite
AI Engineer - RAG & Semantic Search
AI Engineer
Lead the design and implementation of Retrieval-Augmented Generation (RAG) pipelines and semantic search solutions, building scalable APIs on top of Chroma and Cohere and applying Agile practices to deliver high‑quality AI services.
About the role
Key Responsibilities
- Design, develop, and maintain RAG pipelines that integrate large language models with vector databases for real‑time semantic search.
- Build and expose robust APIs for data ingestion, chunking, and retrieval, ensuring low latency and high throughput.
- Collaborate with cross‑functional teams using Agile methodologies to iterate on features and improve model performance.
- Store and index vector embeddings in Chroma, tuning index and storage settings for query efficiency.
- Integrate Cohere’s language models, managing prompt engineering and fine‑tuning for domain‑specific use cases.
- Monitor system health, troubleshoot issues, and continuously refine models based on user feedback and analytics.
Requirements
- Proven experience in building RAG systems and semantic search solutions.
- Strong programming skills in Python and familiarity with API frameworks (FastAPI, Flask).
- Hands‑on experience with vector databases such as Chroma or Pinecone.
- Knowledge of large language models and prompt engineering, preferably with Cohere or similar providers.
- Experience working in Agile/Scrum environments and collaborating with product and data teams.