Remote / Onsite
Sr. Data Architect - Data Engineering 3 - Genpact
Software Engineer
Lead the design and implementation of enterprise‑scale data platforms, driving AI‑enabled analytics using Python, Spark, SQL, and cloud services such as AWS while shaping data models and ETL pipelines.
About the role
Key Responsibilities
- Architect and build robust, scalable data platforms that support AI and advanced analytics workloads.
- Design end‑to‑end data pipelines using Python, SQL, and Apache Spark for batch and real‑time processing.
- Define and maintain enterprise data models, data governance standards, and metadata management practices.
- Collaborate with data scientists, engineers, and business stakeholders to translate analytical requirements into technical solutions.
- Evaluate, select, and integrate cloud services (e.g., AWS S3, Redshift, Glue) to optimize storage, compute, and cost efficiency.
- Mentor junior engineers and promote best practices in data engineering, testing, and documentation.
Requirements
- 10+ years of experience in data architecture or data engineering, with a strong focus on large‑scale, cloud‑native solutions.
- Proficiency in Python, SQL, and Apache Spark for building high‑performance data pipelines.
- Deep knowledge of AWS services (S3, Redshift, EMR, Glue) and experience designing data lakes/warehouses.
- Expertise in data modeling, ETL design, and data governance frameworks.
- Strong problem‑solving skills, the ability to work in fast‑paced, innovative environments, and excellent communication skills for working with cross‑functional teams.
Skills
Python, SQL, Apache Spark, AWS