onsite
Lead Data Engineer - TPG Telecom
Data Engineer
Lead the design and implementation of scalable data pipelines and a lakehouse architecture using Python, Spark, and AWS. The role drives execution of a company‑wide AI strategy and delivers high‑quality data assets for analytics and machine learning.
About the role
Key Responsibilities
- Architect, develop, and maintain end‑to‑end data pipelines that ingest, transform, and load large volumes of structured and unstructured data into a unified data lakehouse.
- Collaborate with data scientists, analysts, and product teams to define data requirements, quality standards, and governance policies.
- Optimize pipeline performance using Spark, SQL, and AWS services (Glue, Redshift, S3, Lake Formation) to ensure low latency and high throughput.
- Implement robust data quality checks, monitoring, and alerting to guarantee data reliability for downstream AI and analytics workloads.
- Mentor and guide junior engineers, fostering best practices in coding, testing, and documentation.
Requirements
- 5+ years of experience in data engineering with a strong focus on big data technologies.
- Proficiency in Python, Apache Spark, and SQL; experience with AWS data services.
- Hands‑on experience building data lakehouse architectures and implementing ETL/ELT pipelines.
- Solid understanding of data modeling, metadata management, and data governance.
- Excellent communication skills and a proven ability to work independently in a fast‑paced environment.
Skills
Python, Apache Spark, SQL, AWS