onsite
Lead Data Engineer - Data Publication and Transformation - Capital One
Data Engineer
Lead Data Engineer driving data publication and transformation, architecting scalable pipelines with Python, Spark, and AWS to deliver high‑quality data products across the organization.
About the role
Key Responsibilities
- Design, build, and maintain end‑to‑end data pipelines that ingest, transform, and publish data to downstream analytics and reporting systems.
- Collaborate with cross‑functional Agile teams to define data requirements, data models, and performance targets.
- Implement best practices for data quality, lineage, and governance using AWS services and open‑source tooling.
- Mentor and coach junior engineers, fostering a culture of continuous improvement and knowledge sharing.
- Optimize pipeline performance and cost through efficient use of Spark, SQL, and cloud resources.
Requirements
- 5+ years of experience in data engineering, with a strong background in Python, SQL, and Spark.
- Proven expertise in AWS data services (Glue, Redshift, S3, Athena) and data lake architecture.
- Deep understanding of data modeling, ETL design, and data quality best practices.
- Experience leading technical initiatives and mentoring team members.
- Excellent communication skills and a collaborative mindset.
Skills
Python, SQL, Apache Spark, AWS