onsite
Senior Data Engineer - Big Data Platform - PNC Financial Services Group
Data Engineer
Senior Data Engineer responsible for designing, building, and optimizing large‑scale data pipelines and platforms using Python, Spark, Hadoop, and cloud services to enable analytics and data products.
About the role
Key Responsibilities
- Design, develop, and maintain high‑performance data pipelines that ingest, transform, and store terabytes of structured and unstructured data.
- Implement scalable big‑data solutions using Apache Spark, the Hadoop ecosystem, and cloud services (AWS S3, Redshift, EMR).
- Build and orchestrate workflows with Apache Airflow, ensuring reliable scheduling, monitoring, and error handling.
- Integrate real‑time streaming data using Apache Kafka and related technologies.
- Collaborate with data scientists, analysts, and product teams to deliver data products that meet business requirements.
- Establish best practices for data quality, governance, and performance tuning.
Requirements
- 5+ years of professional experience in data engineering, with a focus on big‑data platforms.
- Strong proficiency in Python and SQL for data manipulation and pipeline development.
- Hands‑on experience with Apache Spark, Hadoop, and related distributed processing frameworks.
- Proven expertise in cloud environments, preferably AWS, including services such as S3, Redshift, and EMR.
- Familiarity with streaming technologies (Kafka) and workflow orchestration tools (Airflow).
Skills
Python, SQL, Apache Spark, Hadoop, AWS, Kafka, Airflow