remote

Senior Systems Development Engineer - AWS Generative AI & ML Servers - Amazon Web Services

AI Engineer

Design and operate high‑performance AWS server platforms for generative AI, ML training, and HPC workloads, delivering continuous price‑performance improvements for large language models and next‑generation cloud services.

About the role

Key Responsibilities

Architect, develop, and launch server hardware and firmware solutions that power AWS generative AI and ML workloads.
Collaborate with product, software, and data‑science teams to optimize performance, scalability, and cost for large‑scale model training and inference.
Drive continuous improvement of instance types, integrating the latest CPU, GPU, and accelerator technologies.
Implement monitoring, automation, and debugging tools to ensure high availability and reliability of AI/ML services.
Contribute to technical roadmaps, evaluate emerging hardware trends, and prototype innovative solutions for future AWS offerings.

Requirements

5+ years of experience in systems development, hardware engineering, or low‑level software for high‑performance compute platforms.
Strong proficiency in C++ and Python for firmware, driver, and automation development.
Deep knowledge of Linux operating systems, networking, and performance tuning for AI/ML workloads.
Hands‑on experience with GPU/accelerator architectures, HPC clusters, and large‑scale distributed training systems.
Demonstrated ability to work cross‑functionally in a fast‑moving cloud environment and deliver production‑grade solutions.

Skills

AWS, Linux, C, Python, machine learning

Company: Amazon Web Services

Department: Engineering

Location: Seattle, Washington, United States

Experience: 5+ years

Tenure: full-time

Level: Senior

Posted June 24, 2026