onsite
Software Development Engineer, AWS Step Functions - Amazon.com
Software Engineer
About the role
Lead the design and implementation of large‑scale, fault‑tolerant distributed applications using AWS Step Functions, ensuring consistency, durability, and high availability across thousands of nodes.
Key Responsibilities
- Architect and develop scalable distributed workflows with AWS Step Functions, integrating multiple microservices and third‑party APIs.
- Design and enforce consistency, durability, and availability guarantees across distributed components.
- Implement fault‑tolerance strategies for infrastructure failures, network partitions, and service disruptions.
- Collaborate with cross‑functional teams to translate business requirements into robust, low‑code orchestration solutions.
- Monitor, troubleshoot, and optimize workflow performance and cost efficiency at scale.
Requirements
- Strong experience with AWS Step Functions and related AWS services (Lambda, SQS, SNS, DynamoDB).
- Proven background in building and operating large‑scale distributed systems.
- Deep understanding of consistency models, durability, and high availability principles.
- Excellent problem‑solving skills and ability to work in a fast‑paced, collaborative environment.
- Proficiency in at least one programming language (Python, Java, or JavaScript on Node.js) for workflow logic.