remote

Principal LLM Engineer Productionization - a5labs

LLM Engineer

Lead the end‑to‑end productionization of large language models, designing scalable MLOps pipelines, deploying on cloud infrastructure, and ensuring robust monitoring and performance tuning for enterprise AI solutions.

About the role

Key Responsibilities

Architect and implement production‑ready pipelines for training, fine‑tuning, and serving large language models at scale.
Design and maintain CI/CD workflows, containerization (Docker), and orchestration (Kubernetes) for model deployment.
Collaborate with data scientists to translate research prototypes into reliable, high‑throughput services.
Implement monitoring, logging, and alerting to track model performance, detect drift, and support compliance.
Optimize resource utilization and cost across cloud platforms (AWS, GCP, or Azure).

Requirements

10+ years of software engineering experience with a focus on AI/ML systems.
Deep expertise in large language models, transformer architectures, and related frameworks (e.g., Hugging Face, TensorFlow, PyTorch).
Proven track record in MLOps, containerization, and cloud deployment at scale.
Strong programming skills in Python and familiarity with infrastructure as code.
Excellent problem‑solving, communication, and leadership abilities.

Skills

mlops, python, docker, kubernetes

Company: a5labs

Department: Research

Location: India

Experience: 7+ years

Tenure: full-time

Level: Lead

Posted June 21, 2026