onsite
Data Engineer - ADF & Databricks - Genesis NGN Inc.
Data Engineer
About the role
Lead the design and optimization of large‑scale data pipelines using Spark, PySpark, Hive, Azure Data Factory and Databricks, ensuring high performance, reliability and scalability in a cloud environment.
Key Responsibilities
- Design, develop and maintain scalable data pipelines with Apache Spark, PySpark and Hive.
- Implement and orchestrate data workflows in Azure Data Factory and Databricks notebooks.
- Optimize ETL processes for performance, cost and reliability across cloud storage and compute resources.
- Collaborate with data scientists and analysts to deliver clean, high‑quality data for analytics and reporting.
- Monitor pipeline health, troubleshoot issues and implement automated alerts and logging.
Requirements
- 3+ years of experience building data pipelines with Spark, PySpark and Hive.
- Proficiency in Azure Data Factory, Databricks, and cloud data services.
- Strong SQL and Python programming skills.
- Experience with data modeling, schema design and performance tuning.
- Excellent problem‑solving skills and a collaborative mindset.
Skills
Apache Spark, Hive, Databricks, Python, SQL