onsite
GCP Application Senior Developer - HCLTech
Software Engineer
Senior GCP Application Developer with deep expertise in Cloudera Data Platform, PySpark, and data governance. Lead the design and implementation of scalable data pipelines, cataloging, and lineage solutions using SQL, Iceberg, and the Open Data Contract Standard (ODCS).
About the role
Key Responsibilities
- Design, develop, and maintain large‑scale data pipelines on Cloudera Data Platform (CDP) using PySpark and SQL.
- Implement data cataloging, lineage, and governance frameworks with Apache Ranger and Hive Metastore.
- Integrate and manage table and file formats such as Iceberg (table format) and Parquet (file format), optimizing partitioning and bucketing strategies.
- Apply the Open Data Contract Standard (ODCS) to ensure data quality and compliance across services.
- Collaborate with cross‑functional teams to translate business requirements into robust data solutions.
Requirements
- 5+ years of experience in big data engineering and GCP environments.
- Proven expertise in PySpark, CDP components (CDE, CDW, Ozone, Airflow, SDX), and Apache Ranger.
- Strong knowledge of Hive Metastore, data cataloging, and governance best practices.
- Hands‑on experience with Iceberg, Parquet, and advanced partitioning techniques.
- Excellent problem‑solving skills and ability to work in a fast‑paced, collaborative setting.