Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
NVIDIA is seeking a Deep Learning Software Engineer, TensorRT Performance to analyze and improve the performance of its deep learning inference ecosystem. The role involves developing benchmarking methodologies, contributing to inference frameworks such as TensorRT, and optimizing deep learning models across NVIDIA accelerators to set the gold standard for Generative AI performance.
NVIDIA is looking for a Deep Learning Software Engineer, TensorRT Performance to join its rapidly growing research and development team for Deep Learning Inference. This role focuses on analyzing and improving the performance of NVIDIA’s inference ecosystem. Companies worldwide use NVIDIA GPUs for deep learning, driving breakthroughs in Generative AI, Recommenders, and Vision. The successful candidate will join a team that builds software for performance optimization, deployment, and serving of DL inference solutions. The team specializes in GPU-accelerated deep learning inference software such as TensorRT, in DL benchmarking, and in performant model deployment solutions.
You will collaborate with the deep learning community to integrate TensorRT into OSS frameworks like TensorRT-EdgeLLM and PyTorch. Key responsibilities include identifying performance opportunities, optimizing state-of-the-art models across NVIDIA accelerators (from datacenter GPUs to edge SoCs), and implementing graph compiler algorithms, frontend operators, and code generators within NVIDIA’s inference ecosystem. You will also work with other teams on workflow improvements, performance modeling and analysis, kernel development, and inference software development.
Posted June 9, 2026