Remote
Software Development Engineer - Big Data, AWS Elastic MapReduce (EMR), EMR on EKS - Amazon.com
Software Engineer
Senior engineer building next‑generation cluster management and real‑time processing for AWS EMR, leveraging Hadoop, Spark, and Python to deliver scalable, high‑performance big data solutions.
About the role
Key Responsibilities
- Design, develop, and maintain the next‑generation EMR cluster management system, ensuring high availability and scalability for millions of customer clusters.
- Implement real‑time data processing pipelines using Spark, Hive, and Presto to enable instant insights on massive datasets.
- Collaborate with cross‑functional teams to define feature requirements, optimize performance, and integrate new services into the EMR ecosystem.
- Write clean, well‑tested code in Python and Scala, and contribute to open‑source components used by the broader AWS community.
- Participate in code reviews, performance tuning, and troubleshooting to continuously improve system reliability and user experience.
Requirements
- 5+ years of software engineering experience with a focus on big data platforms.
- Strong proficiency in AWS services, especially EMR, S3, and EC2.
- Hands‑on experience with Hadoop ecosystem components (HDFS, Hive, Pig, Impala, Spark, Presto, HBase).
- Proficient in Python and Scala; familiarity with Java is a plus.
- Excellent problem‑solving skills and a passion for building scalable, high‑performance distributed systems.