remote
LLM Ops Engineer - Heidi
Research Engineer
Build, deploy, and maintain large language model pipelines for an AI Care Partner, leveraging Python, cloud infrastructure, and MLOps best practices to ensure reliable, secure, and scalable AI services for clinicians.
About the role
Key Responsibilities
- Design, implement, and operate end‑to‑end LLM inference pipelines supporting real‑time clinical workflows.
- Develop CI/CD workflows and automated testing for model versioning, containerization, and deployment on Kubernetes clusters.
- Collaborate with data scientists and product teams to optimize prompts, fine‑tune models, and monitor performance metrics.
- Ensure compliance with health data protection regulations (HIPAA, GDPR) through robust access controls, logging, and monitoring.
- Maintain cloud infrastructure on AWS, including serverless components, networking, and cost‑optimization.
Requirements
- 3+ years of software engineering experience with Python and container technologies (Docker, Kubernetes).
- Hands‑on experience building and scaling LLM services (e.g., GPT, LLaMA) in production.
- Proficiency with MLOps tools such as MLflow or Kubeflow, and with infrastructure as code tools such as Terraform.
- Strong understanding of prompt engineering, model evaluation, and latency optimization.
- Experience implementing security and compliance controls for protected health information.
Skills
Python, MLOps, Kubernetes, AWS, CI/CD, Docker