onsite
Principal Hardware Diagnostics Engineer - Graphcore
Software Engineer
Lead development of diagnostic frameworks and tools for AI accelerator hardware, drawing on expertise in Python, C++, Linux, and ASIC/FPGA design to ensure high‑performance, reliable AI compute platforms.
About the role
Key Responsibilities
- Design and implement comprehensive hardware diagnostic methodologies for AI accelerator silicon and board‑level products.
- Develop automated test and validation software using Python and C++ on Linux platforms to detect performance regressions and failure modes.
- Collaborate with ASIC and FPGA design teams to integrate built‑in self‑test (BIST) features and improve signal‑integrity monitoring.
- Lead root‑cause analysis of field failures, creating detailed debug plans and corrective action reports.
- Mentor junior engineers and establish best practices for hardware debugging, data collection, and reporting.
Requirements
- 10+ years of experience in hardware diagnostics, validation, or reliability engineering for high‑performance compute silicon.
- Strong programming skills in Python and C++ in Linux environments.
- Deep understanding of ASIC/FPGA design flows, signal integrity, and board‑level debugging techniques.
- Proven ability to develop automated test frameworks and analyze large volumes of hardware telemetry data.
- Excellent communication skills and experience leading cross‑functional teams.