onsite
Senior Machine Learning Engineer - Graphcore
ML Engineer
Senior ML Engineer responsible for validating and benchmarking Graphcore's accelerator stack, ensuring numerical precision, performance, and correctness across modern frameworks, models, and distributed execution environments.
About the role
Key Responsibilities
- Design and execute comprehensive validation tests for the ML stack that integrates hardware accelerators with software frameworks.
- Develop automated benchmarking pipelines to measure performance, precision, and scalability of open‑source models on Graphcore IPUs.
- Identify, reproduce, and diagnose regressions, correctness issues, and performance bottlenecks across TensorFlow, PyTorch, and custom runtimes.
- Create targeted low‑level tests for numerical precision, quantisation, attention mechanisms, and distributed execution.
- Collaborate with hardware, compiler, and runtime teams to provide actionable feedback and drive improvements.
Requirements
- 5+ years of experience in machine learning engineering, with strong proficiency in Python and C++.
- Deep understanding of ML frameworks such as TensorFlow and PyTorch, and experience optimizing models for accelerator hardware.
- Hands‑on experience with CUDA, low‑level performance profiling, and Linux‑based development environments.
- Proven ability to build and maintain automated benchmarking and testing infrastructure.
- Strong analytical skills to investigate complex system behavior and communicate findings clearly.
Skills
python, c, tensorflow, pytorch, cuda, linux