onsite

AI Engineer - RAG & Semantic Search

Lead the design and implementation of Retrieval-Augmented Generation (RAG) pipelines and semantic search solutions, building scalable APIs with Chroma and Cohere while applying Agile practices to deliver high‑quality AI services.

About the role

Key Responsibilities

Design, develop, and maintain RAG pipelines that integrate large language models with vector databases for real‑time semantic search.
Build and expose robust APIs for data ingestion, chunking, and retrieval, ensuring low latency and high throughput.
Collaborate with cross‑functional teams using Agile methodologies to iterate on features and improve model performance.
Store, index, and tune vector embeddings in Chroma, optimizing storage and query efficiency.
Integrate Cohere’s language models, managing prompt engineering and fine‑tuning for domain‑specific use cases.
Monitor system health, troubleshoot issues, and continuously refine models based on user feedback and analytics.

Requirements

Proven experience in building RAG systems and semantic search solutions.
Strong programming skills in Python and familiarity with API frameworks (FastAPI, Flask).
Hands‑on experience with vector databases such as Chroma or Pinecone.
Knowledge of large language models and prompt engineering, preferably with Cohere or similar providers.
Experience working in Agile/Scrum environments and collaborating with product and data teams.

Skills

RAG, Agile

Department: Engineering

Location: Sao Paulo, Brazil

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 25, 2026