onsite
Senior Solutions Architect, Generative AI Research - NVIDIA
Research Engineer
NVIDIA is seeking a Senior Solutions Architect to drive generative AI research collaborations with universities. The role focuses on foundation models, multimodal AI, and scalable inference systems built with Python, PyTorch, CUDA, and cloud‑based distributed architectures.
About the role
Key Responsibilities
- Partner with faculty and graduate researchers to design, prototype, and optimize large language and vision‑language models on NVIDIA accelerated platforms.
- Provide technical guidance on pre‑training, fine‑tuning, evaluation, and inference pipelines, with attention to performance, efficiency, and scalability.
- Architect distributed training and inference solutions using CUDA, multi‑GPU clusters, and cloud resources (AWS/GCP).
- Develop reference implementations, best‑practice guides, and demo applications that showcase advanced agent behaviors such as tool use, planning, memory, and multi‑agent coordination.
- Collaborate with internal product and engineering teams to translate research needs into product features and roadmap inputs.
Requirements
- 5+ years of hands‑on experience with Python and deep‑learning frameworks, especially PyTorch.
- Deep understanding of large language models, vision‑language models, and generative AI research workflows.
- Proven expertise in CUDA programming, GPU‑accelerated computing, and building distributed training/inference systems.
- Experience deploying AI workloads on cloud platforms and managing high‑performance compute clusters.
- Strong communication skills and a track record of successful collaborations with academic researchers.