onsite

Senior AI/ML Engineer Postdoc - LLM Infrastructure - Technische Informationsbibliothek

ML Engineer

Lead the design, deployment, and scaling of Large Language Model infrastructure, leveraging Python, deep‑learning frameworks, and cloud/Kubernetes technologies to support cutting‑edge research and services.

About the role

Key Responsibilities

Architect, implement, and maintain scalable infrastructure for training and serving Large Language Models (LLMs) in a cloud‑native environment.
Develop robust pipelines in Python with PyTorch or TensorFlow for data preprocessing, model fine‑tuning, and evaluation.
Containerize applications with Docker and orchestrate workloads on Kubernetes clusters, ensuring high availability and resource efficiency.
Integrate infrastructure with AWS services (e.g., S3, EC2, SageMaker) and on‑premises HPC resources to meet performance and cost targets.
Collaborate with research scientists to translate experimental prototypes into production‑ready services.

Requirements

Ph.D. in Computer Science, Machine Learning, or a related field, with a strong publication record in LLMs or deep learning.
Extensive hands‑on experience with Python and major deep‑learning frameworks (PyTorch, TensorFlow).
Proven expertise in containerization (Docker) and orchestration (Kubernetes) for AI workloads.
Solid understanding of cloud platforms, preferably AWS, and experience with large‑scale distributed training.
Ability to work independently, mentor junior team members, and communicate complex technical concepts effectively.

Skills

Python, PyTorch, TensorFlow, Kubernetes, Docker, AWS

Company: Technische Informationsbibliothek

Department: Research

Location: Hannover, Germany

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 24, 2026