onsite
Lead Data Engineer - dentsu
Data Engineer
Lead data engineering initiatives focused on identity resolution: handling PII, building robust pipelines, and improving matching engine accuracy using Python, Spark, SQL, and AWS services.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines for ingesting and processing PII‑class identity data.
- Analyze and improve matching engine logic, ensuring high accuracy and compliance with data privacy standards.
- Implement data quality frameworks, monitoring, and remediation processes to guarantee reliable identity resolution.
- Collaborate with data scientists, product owners, and security teams to translate business requirements into technical solutions.
- Optimize performance of Spark and SQL workloads on AWS, leveraging services such as EMR, Redshift, and S3.
Requirements
- 5+ years of experience in data engineering, with a focus on identity or consumer data.
- Strong proficiency in Python, SQL, and Apache Spark for large‑scale data processing.
- Hands‑on experience with AWS data services (EMR, Redshift, S3, Lambda) and infrastructure as code.
- Demonstrated ability to implement data quality controls and resolve complex data matching issues.
- Excellent problem‑solving skills and a passion for understanding the nuances of identity data.
Skills
Python, SQL, Apache Spark, AWS