onsite

ML Kernel Performance Engineer, Edge AI and Science - Amazon.com

Software Engineer

Design and optimize high‑performance ML kernels for a next‑generation edge AI compression platform, leveraging CUDA, C++, and Linux profiling tools to maximize compute efficiency on custom neural accelerator silicon.

About the role

Key Responsibilities

Develop, benchmark, and tune machine‑learning kernels for a proprietary edge AI compression platform.
Collaborate with hardware architects to align software optimizations with custom neural accelerator silicon capabilities.
Implement performance‑critical code in C++/CUDA and create Python tooling for rapid experimentation.
Use Linux profiling suites (e.g., perf, Nsight) to identify bottlenecks and drive 20‑100x compression efficiency improvements.
Contribute to cross‑functional design reviews, providing data‑driven recommendations for kernel‑ and system‑level enhancements.

Requirements

Strong programming experience in C++ and CUDA, with solid Python scripting skills.
Deep understanding of performance profiling, low‑level optimization, and memory hierarchy on Linux systems.
Hands‑on experience with machine‑learning frameworks and edge AI workloads.
Proven ability to work with hardware teams to co‑design software that exploits custom accelerator features.
BS/MS in Computer Science, Electrical Engineering, or a related field; advanced degree preferred.

Skills

python, c, cuda, linux, machine learning

Company: Amazon.com

Department: Engineering

Location: Sunnyvale, California, United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 24, 2026