hybrid
Research Engineer, Machine Learning
As a Research Engineer, ML track, you will build and optimize large-scale learning systems for Mistral AI's open-weight models, working closely with Research Scientists. You will contribute to enhancing shared training frameworks and data pipelines, or integrate into research squads to scale fresh ideas into repeatable code. This role involves accelerating researchers, interfacing cutting-edge research with production, conducting experiments, and designing and implementing ML algorithms.
About the role
About the Research Engineering team
The team spans Platform (shared infra & clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve.
As a Research Engineer, ML track, you’ll build and optimise the large-scale learning systems that power our open-weight models. Working hand-in-hand with Research Scientists, you’ll join one of two teams:
- Platform RE Team: enhance the shared training framework, data pipelines and cluster tooling used by every team; or
- Embedded RE Team: sit inside a research squad (Alignment, Pre-training, Multimodal, Safety, …) and turn fresh ideas into repeatable, scalable code.
What you will do
- Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.
- Connect cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.
- Conduct experiments on the latest deep-learning techniques (sparsified 70B+ runs, distributed training on thousands of GPUs).
- Design, implement and benchmark ML algorithms; write clear, efficient code in Python.
- Deliver prototypes that become production-grade components for Le Chat and our enterprise API.
About you
- Master’s or PhD in Computer Science (or equivalent proven track record).
- 4+ years working on large-scale ML codebases.
- Hands-on experience with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).
- Experience in deep learning, NLP or LLMs; CUDA or data-pipeline expertise is a bonus.
- Strong software-design instincts: testing, code review, CI/CD.
- Self-starter, low-ego, collaborative.