Remote
Senior Technical Program Manager - DGX Cloud - NVIDIA
Product Manager
Senior Technical Program Manager driving end‑to‑end delivery of NVIDIA DGX Cloud infrastructure, coordinating global cloud providers and engineering teams to ensure reliable, high‑performance platforms for AI researchers.
About the role
Key Responsibilities
- Lead cross‑functional programs that integrate NVIDIA DGX Cloud services with major public cloud providers (AWS, Azure, GCP).
- Define, track, and communicate project milestones, risks, and deliverables to stakeholders across engineering, product, and operations.
- Collaborate with site reliability, networking, and security teams to design and implement scalable, secure cloud infrastructure using Kubernetes, Docker, and CI/CD pipelines.
- Drive continuous improvement of deployment processes, automation, and monitoring to meet performance and availability targets for research workloads.
- Facilitate technical reviews, post‑mortems, and knowledge‑sharing sessions to align engineering efforts with customer experience goals.
Requirements
- 5+ years of technical program or project management experience in cloud infrastructure or high‑performance computing environments.
- Strong understanding of container orchestration (Kubernetes), containerization (Docker), and CI/CD tooling.
- Hands‑on experience with at least one major cloud platform (AWS, Azure, or GCP) and scripting/automation using Python.
- Proven ability to manage complex, multi‑team initiatives, communicate clearly with technical and non‑technical audiences, and deliver on tight timelines.
- Excellent problem‑solving skills, a data‑driven mindset, and a passion for enabling cutting‑edge AI research.
Skills
Kubernetes, Docker, CI/CD, Python, AWS