Remote / Onsite
Senior Data Scientist - GenAI - Corridor Platforms
Data Scientist
Lead the design, development, and deployment of large language model pipelines, integrating retrieval‑augmented generation, memory, caching, and external APIs to power compliant, scalable GenAI solutions for regulated financial services.
About the role
Key Responsibilities
- Architect and implement end‑to‑end LLM pipelines, selecting and fine‑tuning foundation models for domain‑specific tasks.
- Integrate retrieval‑augmented generation (RAG) components, building efficient memory and caching layers to support real‑time inference.
- Collaborate with data engineering and product teams to ingest, preprocess, and secure large datasets, ensuring compliance with regulatory standards.
- Develop and maintain robust monitoring, logging, and performance‑tuning workflows for production GenAI services.
- Document best practices and model governance policies, and provide technical mentorship to junior team members.
Requirements
- 5+ years of experience in data science or machine learning engineering, with a strong focus on NLP and LLMs.
- Hands‑on experience building RAG systems, managing vector stores, and deploying models at scale on cloud platforms (AWS, GCP, or Azure).
- Solid understanding of data privacy, security, and compliance requirements in regulated industries.
- Excellent communication skills and a proven ability to translate complex technical concepts into actionable business insights.
Skills
Python, machine learning, generative AI