onsite
Senior Staff Software Development Engineer - Collectives and Network Optimization - AMD
Software Engineer
Lead the design and implementation of high‑performance collectives and network‑optimization software for next‑generation AI and data‑center GPUs, driving scalability, latency reduction, and integration with AMD's hardware stack.
About the role
Key Responsibilities
- Architect, develop, and optimize collective communication libraries and network‑stack components for AMD GPU platforms.
- Collaborate with hardware, driver, and compiler teams to ensure tight integration and maximal performance.
- Design and implement scalable algorithms for data movement, reduction, and synchronization across multi‑node systems.
- Profile, benchmark, and tune software to meet stringent latency and throughput targets for AI and HPC workloads.
- Mentor junior engineers and drive best practices in code quality, testing, and documentation.
Requirements
- 10+ years of software engineering experience, with deep expertise in C++ and Python for performance‑critical systems.
- Strong background in GPU programming (CUDA, HIP, or similar) and in low‑level Linux kernel and network‑stack development.
- Proven track record of optimizing distributed/collective communication algorithms for AI, HPC, or data‑center environments.
- Experience with performance analysis tools (e.g., VTune, perf, Nsight) and the ability to translate metrics into code improvements.
- Excellent problem‑solving skills, a collaborative mindset, and the ability to lead complex, cross‑functional projects.