onsite
Senior Data Engineer - Big Data Platform - PNC Bank
Data Engineer
The Senior Data Engineer designs, builds, and optimizes large‑scale data pipelines and platforms using Python, Spark, Hadoop, and AWS services to enable real‑time analytics and data products.
About the role
Key Responsibilities
- Design, develop, and maintain high‑performance data pipelines on Hadoop and Spark clusters.
- Implement data ingestion and streaming solutions using Kafka and AWS services (S3, Glue, Redshift).
- Collaborate with data scientists and product teams to create scalable data models and enable self‑service analytics.
- Optimize SQL queries and ETL processes for reliability, latency, and cost efficiency.
- Establish best practices for data quality, monitoring, and documentation across the big‑data platform.
Requirements
- 5+ years of professional experience in data engineering, with strong expertise in Python and SQL.
- Hands‑on experience with Apache Spark, the Hadoop ecosystem, and streaming technologies such as Kafka.
- Proficiency in AWS cloud services (S3, EMR, Redshift, Glue) and infrastructure‑as‑code concepts.
- Solid understanding of data modeling, warehousing, and performance tuning for large datasets.
- Excellent problem‑solving skills and the ability to work cross‑functionally in an agile environment.
Skills
Python, SQL, Apache Spark, Hadoop, AWS, Kafka