onsite
AI Software Engineer - NPU/Hexagon DSP Kernel Optimization - Qualcomm
Software Engineer
Embedded AI software engineer focused on high‑performance neural network kernels for Qualcomm NPU and Hexagon DSP processors, driving runtime efficiency and new operator support on next‑generation AI platforms.
About the role
Key Responsibilities
- Design, develop, and optimize AI software components for Qualcomm NPU and Hexagon DSP processors.
- Implement high‑performance neural network kernels and operators that minimize latency and maximize throughput.
- Collaborate closely with architecture and hardware teams to integrate new AI features and platform enhancements.
- Profile, benchmark, and tune code for peak performance on embedded DSP and NPU targets.
- Document kernel interfaces, performance metrics, and best‑practice guidelines for internal use.
Requirements
- Strong proficiency in C++ and assembly language for embedded systems.
- Experience with DSP programming, the Hexagon architecture, and NPU acceleration.
- Deep understanding of neural network models, AI frameworks, and performance optimization techniques.
- Ability to work cross‑functionally with hardware, firmware, and software teams.
- Excellent problem‑solving skills and a passion for pushing AI inference to new performance limits.