remote

Software Development Engineer, Machine Learning Networking Performance - Amazon

ML Engineer

Develop and optimize machine‑learning‑driven networking performance solutions for AWS infrastructure, using Python, C++, and cloud services to improve latency, throughput, and reliability at global scale.

About the role

Key Responsibilities

Design, implement, and deploy ML models that predict and enhance network performance across AWS data centers.
Develop high‑performance C++ and Python code for real‑time telemetry collection, analysis, and automated remediation.
Collaborate with networking, systems, and data‑science teams to integrate ML solutions into existing AWS services and tooling.
Build scalable data pipelines and feature stores using AWS services (e.g., S3, Kinesis, SageMaker) to support model training and inference.
Monitor, evaluate, and continuously improve model accuracy, latency, and resource utilization in production environments.

Requirements

Bachelor's degree or higher in Computer Science, Electrical Engineering, or a related field, with 3+ years of software development experience.
Strong proficiency in Python and C++, plus experience building production‑grade ML systems.
Deep understanding of networking concepts (TCP/IP, routing, congestion control) and performance metrics.
Hands‑on experience with AWS services such as EC2, S3, Lambda, and SageMaker.
Proven ability to work on large‑scale, distributed systems and solve complex, data‑intensive problems.

Skills

Python, C++, machine learning, AWS

Company: Amazon

Department: Research

Location: Santa Clara, United States

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 21, 2026