onsite
LLM Engineer (LLM Training) - 42dot
About the role
Lead the design and optimization of large language model training pipelines, enhancing efficiency, accuracy, and self‑refinement capabilities using Python, PyTorch, and distributed GPU frameworks.
Key Responsibilities
- Design and implement end‑to‑end LLM training pipelines that meet production quality standards.
- Optimize pre‑training and post‑training workflows to reduce training time and resource consumption.
- Develop and integrate self‑refinement mechanisms to continuously improve model output quality.
- Collaborate with research teams to experiment with novel training methodologies and evaluate their impact.
- Monitor and troubleshoot GPU‑based training jobs, ensuring high availability and performance.
Requirements
- 3+ years of experience in deep learning or NLP, with a strong research background.
- Expertise in GPU utilization and troubleshooting for large‑scale model training.
- Strong analytical skills and a passion for continuous improvement of model quality.