Software Engineer - Apache Spark
Staff Software Engineer - Apache Spark position — see original posting for full details.
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
The Data Platform Pillar is the bedrock of Cloudera’s technology, where we design and build the core components that let our customers store, manage, and process data with unmatched scalability, security, and performance.
Are you ready to architect the future of big data? Cloudera is searching for a visionary Staff Software Engineer with deep expertise in distributed systems to join the Apache Spark Team. You will be at the forefront of innovation, building our next-generation, enterprise-grade system designed to conquer data challenges at a massive scale—running Spark on thousands of nodes and crunching petabytes of data for the world's largest companies. This is your chance to directly influence the open-source community as a key contributor to Apache Spark while collaborating with a high-impact, distributed team that includes multiple Spark committers. If you're passionate about pushing the boundaries of distributed data processing, come build the impossible with us.
As a Staff Engineer you will:
Pioneer Scalable Solutions: Architect, implement, and deliver next-generation features for Cloudera’s Data Engineering Experience, operating at a massive scale on thousands of production nodes.
Drive Open-Source Innovation: Be a core contributor to Apache Spark, directly shaping the future of distributed data processing in the open-source community.
Build with Modern Stacks: Develop high-performance features using Scala, Java, and Python on modern data platforms.
Deepen Technical Mastery: Gain and apply expert-level knowledge in core distributed data processing concepts, including:
SQL Planners and Optimizers
Data layout and modern table formats like Apache Parquet and Iceberg
Fault tolerance and resilience in large-scale distributed systems
Own the Technology Stack: Develop a deep technical understanding of components across the Cloudera Data Engineering Experience, with a focus on Iceberg and Spark, applying this knowledge to your daily tasks.
Conquer Large-Scale Challenges: Work hands-on with massive distributed systems, scaling from hundreds to thousands of nodes in live production clusters.
Ensure System Integrity: Conduct thorough root cause analysis, debug complex system-level deployment issues, and resolve failures to maintain high system quality.
Enhance Engineering Velocity: Improve internal infrastructure and tooling to streamline de
Posted June 13, 2026