remote
Data Engineer II - Mastercard
Data Engineer
Senior data engineer building scalable pipelines in Python and Spark on AWS, designing data models and orchestrating ETL workflows to support real‑time payment analytics and reporting.
About the role
Key Responsibilities
- Design, develop, and maintain large‑scale data pipelines using Python, Apache Spark, and AWS services (Glue, S3, Redshift).
- Implement robust ETL processes, ensuring data quality, lineage, and performance for transactional and analytical workloads.
- Collaborate with data scientists and product teams to translate business requirements into scalable data models and schema designs.
- Optimize query performance and storage costs across data warehouses and lakehouse architectures.
- Monitor pipeline health, troubleshoot issues, and continuously improve automation and documentation.
Requirements
- 3+ years of experience in data engineering, with strong proficiency in Python and SQL.
- Hands‑on experience with Apache Spark, AWS Glue, and data lake/warehouse solutions.
- Solid understanding of data modeling, ETL best practices, and performance tuning.
- Experience with version control (Git) and CI/CD pipelines for data workflows.
- Excellent problem‑solving skills and the ability to work collaboratively in a fast‑paced environment.
Skills
Python, SQL, Apache Spark, AWS