onsite
Senior Network Solution Engineer, Weekend Coverage - NVIDIA
Software Engineer
Lead weekend support for NVIDIA’s cutting‑edge InfiniBand, NVLink, and Spectrum‑X network systems, diagnosing complex code and logic issues in AI cluster environments using Python and C++.
About the role
Key Responsibilities
- Provide expert weekend coverage for InfiniBand, NVLink, and Spectrum‑X network systems, ensuring high availability for AI clusters.
- Analyze and debug deep-seated code and logic issues in production environments, collaborating with R&D and customer support teams.
- Perform root‑cause analysis, develop patches, and document solutions for recurring network problems.
- Maintain and update network configuration scripts and automation tools using Python and C++.
- Participate in on‑call rotations, delivering rapid response and resolution for critical incidents.
Requirements
- 5+ years of experience in network engineering or software development focused on high‑performance computing.
- Proficiency in Python and C++, with strong debugging and profiling skills.
- Deep knowledge of InfiniBand, NVLink, and Spectrum‑X technologies and their integration in AI infrastructure.
- Hands‑on experience with production network operations, incident response, and performance tuning.
- Excellent communication skills and the ability to work independently during weekend shifts.