Remote

Systems Development Engineer - AWS Generative AI & ML Servers - Amazon.com

AI Engineer

Design, build, and operate high‑performance AWS cloud services for generative AI and machine‑learning training and inference, delivering continuous price‑performance improvements for large‑scale LLM workloads.

About the role

Key Responsibilities

Design and implement core AWS services that power generative AI and ML workloads, focusing on performance, scalability, and cost efficiency.
Develop and maintain server‑side software for AI/ML accelerators, integrating with AWS instance types and hardware stacks.
Collaborate with hardware, systems, and ML teams to optimize training and inference pipelines for multi‑billion‑parameter models.
Drive continuous improvement of cloud offerings through performance benchmarking, profiling, and automated testing.
Participate in the full lifecycle of service delivery, from prototype and proof‑of‑concept to production deployment and operations.

Requirements

Strong programming experience in Python and C++ for systems‑level development.
Deep understanding of AWS services, cloud infrastructure, and virtualization technologies.
Hands‑on experience with machine‑learning frameworks, large‑scale model training, and generative AI concepts.
Proven ability to work on high‑performance computing (HPC) workloads and optimize for latency, throughput, and cost.
Excellent problem‑solving skills and ability to collaborate across hardware, software, and research teams.

Skills

AWS, Python, C, machine learning, generative AI

Company: Amazon.com

Department: Engineering

Location: Austin, Texas, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 22, 2026