Remote / Onsite
Senior Big Data Engineer - MetLife
Data Engineer
The Senior Big Data Engineer is responsible for designing, building, and optimizing large‑scale data pipelines and analytics platforms using Hadoop, Spark, Kafka, Python, SQL, and AWS in a hybrid work environment.
About the role
Key Responsibilities
- Design, develop, and maintain high‑performance data pipelines on Hadoop and Spark clusters.
- Implement real‑time streaming solutions using Kafka to ingest and process event data.
- Collaborate with data scientists and analysts to provide clean, reliable datasets for advanced analytics and machine‑learning models.
- Optimize SQL queries and data models for performance and cost efficiency on AWS services (S3, EMR, Redshift).
- Establish best practices for data governance, security, and reproducibility across the analytics platform.
- Mentor junior engineers and contribute to architectural decisions and technical roadmaps.
Requirements
- 5+ years of hands‑on experience with the Hadoop ecosystem (HDFS, Hive, Pig) and Spark (Scala or PySpark).
- Strong programming skills in Python and proficiency in writing complex SQL queries.
- Experience building and operating streaming pipelines with Kafka.
- Solid understanding of AWS data services (S3, EMR, Redshift, Glue) and cloud‑native deployment patterns.
- Demonstrated ability to troubleshoot performance issues, implement data quality checks, and work in an Agile, hybrid team setting.
Skills
Apache Spark, Kafka, Python, SQL, AWS