onsite

Data Engineer (Python, Spark)

Data Engineer responsible for building and maintaining scalable data pipelines using Python, Spark, AWS Glue, Airflow, Apache Flink, and Hive on AWS.

About the role

We are looking for a Data Engineer to build and maintain high‑performance data pipelines on AWS. You will use Python, Spark, AWS Glue, Airflow, Apache Flink, and Hive to ingest, transform, and serve data for analytics and machine‑learning workloads.

Key Responsibilities

Design, develop, and deploy scalable ETL pipelines using Spark and Python on AWS Glue.
Orchestrate data workflows with Airflow, ensuring reliability and observability.
Implement real‑time streaming solutions with Apache Flink and batch processing with Hive.
Optimize job performance, monitor resource usage, and troubleshoot failures.
Collaborate with data scientists and product teams to translate business requirements into technical solutions.

Requirements

3+ years of experience building data pipelines in a cloud environment.
Strong proficiency in Python, Spark, and SQL.
Hands‑on experience with AWS Glue, Airflow, Apache Flink, and Hive.
Solid understanding of data modeling, partitioning, and performance tuning.
Excellent problem‑solving skills and a proactive, collaborative mindset.

Skills

Python, Airflow, Apache Flink

Department: Engineering

Location: Bengaluru, Karnataka, India

Experience: 3+ years

Tenure: Full-time

Level: Mid-Level

Posted June 21, 2026