onsite
Hadoop Platform Support Engineer - Cloudious LLC
Software Engineer
About the role
Senior engineer responsible for supporting and maintaining a production Hadoop environment: ensuring high availability, tuning performance, and troubleshooting issues across HDFS, YARN, and Spark components.
Key Responsibilities
- Provide L3 support for Hadoop Distributed File System (HDFS), YARN, and Spark clusters, diagnosing and resolving performance and reliability issues.
- Monitor cluster health, perform capacity planning, and implement proactive maintenance tasks such as data rebalancing and node replacement.
- Collaborate with data engineering and DevOps teams to optimize job scheduling, resource allocation, and data pipeline performance.
- Develop and maintain automation scripts (Python, Bash) for routine cluster operations and incident response.
- Document troubleshooting procedures, create knowledge base articles, and conduct knowledge transfer sessions for junior staff.
Requirements
- 5+ years of hands‑on experience supporting enterprise Hadoop clusters in production environments.
- Deep knowledge of HDFS architecture, YARN resource management, and Spark execution models.
- Proficiency with Cloudera Manager, Ambari, or similar cluster management tools.
- Strong scripting skills in Python or Bash for automation and monitoring.
- Excellent problem‑solving abilities and a customer‑focused mindset.