onsite
Senior Associate - Data Engineer - Capital One
Data Engineer
Lead data engineering initiatives, designing scalable pipelines and data models with Python, Spark, and AWS that turn complex business problems into actionable insights.
About the role
Key Responsibilities
- Design, develop, and maintain large-scale data pipelines using Python, SQL, and Apache Spark on AWS infrastructure.
- Collaborate with data scientists, analysts, and product teams to understand data requirements and deliver high‑quality, reusable data assets.
- Implement robust data modeling, schema design, and metadata management to support analytics and reporting.
- Optimize ETL processes for performance, reliability, and cost efficiency across cloud and on‑premises environments.
- Ensure data quality, governance, and security compliance through automated testing and monitoring.
Requirements
- 5+ years of experience in data engineering with a strong background in Python and SQL.
- Hands‑on expertise with Apache Spark, AWS services (S3, Redshift, Glue, EMR), and data lake architectures.
- Proven ability to design scalable data models and implement end‑to‑end ETL workflows.
- Strong analytical skills and experience working with large, complex datasets.
- Excellent communication skills and a collaborative mindset.
Skills
Python, SQL, Apache Spark, AWS