Remote / Onsite
Director, Data Engineering - Genpact
Software Engineer
Lead a high‑performing data engineering organization, designing and scaling cloud‑native pipelines, real‑time streaming, and AI‑enabled data platforms using Python, Spark, Kafka, and AWS.
About the role
Key Responsibilities
- Define and execute the data engineering strategy to support AI and analytics initiatives across the enterprise.
- Architect, build, and optimize large‑scale, cloud‑native data pipelines and streaming solutions using Apache Spark, Kafka, and AWS services.
- Lead a multidisciplinary team of engineers, fostering a culture of innovation, best‑in‑class code quality, and continuous delivery.
- Collaborate with data scientists, product owners, and business stakeholders to translate complex requirements into robust data models and ETL processes.
- Establish governance, security, and performance standards for data platforms, ensuring compliance and reliability.
Requirements
- 10+ years of experience in data engineering, with at least 5 years in a leadership role.
- Deep expertise in Python, Apache Spark, Kafka, and AWS (e.g., S3, Redshift, EMR, Lambda).
- Proven track record designing and scaling real‑time streaming and batch processing pipelines for AI/ML workloads.
- Strong knowledge of data modeling, ETL best practices, and data governance frameworks.
- Excellent communication and stakeholder management skills, with the ability to drive cross‑functional initiatives.
Skills
Python, Apache Spark, Kafka, AWS, Machine Learning