CUDA Kernel Optimization Specialist - AI Trainer
Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization, using profiler metrics and expertise in CUDA, HIP, and shader programming.
Role Overview
Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization. Use profiler metrics to guide kernel improvements. Review GPU kernel implementations and identify bottlenecks, even without extensive background in the underlying algorithms.
What You Will Do
Write, modify, and reason about C++17, Python, and GPU code. Apply CUDA, HIP, and shader programming expertise to improve kernel performance. Document optimization decisions clearly.
Why It Might Be a Fit
You have at least 1 year of professional or graduate-level research experience with GPUs. You have a strong understanding of the GPU profiler metrics used for kernel optimization. You can optimize GPU kernels without deep prior context on every algorithm.
Originally posted on Himalayas
Posted June 6, 2026