Remote
Software Development Engineer, FSx for Lustre - Amazon Web Services
Software Engineer
Software Development Engineer who builds and scales FSx for Lustre, a high‑performance storage service, using C++, Java, and AWS technologies, with a focus on low‑latency, high‑throughput solutions for GPU‑driven AI/ML and HPC workloads.
About the role
Key Responsibilities
- Design, develop, and maintain core components of the FSx for Lustre service that deliver terabyte‑per‑second throughput and sub‑millisecond latency.
- Implement scalable, fault‑tolerant distributed systems on Linux platforms, optimizing for I/O performance and resource utilization.
- Collaborate with cross‑functional teams (product, operations, security) to define feature requirements and ensure seamless integration with other AWS services.
- Drive performance testing, profiling, and tuning to meet stringent AI/ML and HPC workload benchmarks.
- Participate in code reviews, mentor junior engineers, and contribute to best‑practice engineering standards.
Requirements
- Strong programming experience in C++ and Java, with a deep understanding of Linux system internals.
- Hands‑on experience building and operating large‑scale distributed storage or file‑system services.
- Proficiency in networking concepts, performance optimization, and troubleshooting high‑throughput workloads.
- Familiarity with AWS services and cloud‑native development practices.
- BS/MS in Computer Science or a related field, or equivalent professional experience.