Remote
Staff Data Engineer - National Grid
Data Engineer
Lead the design, development, and optimization of large‑scale data pipelines and platforms, leveraging Python, Spark, and AWS to enable real‑time analytics and robust data models for the energy sector.
About the role
Key Responsibilities
- Architect, build, and maintain scalable data pipelines using Python, Apache Spark, and Kafka to ingest, transform, and deliver high‑volume energy data.
- Design and implement data models and warehouses on AWS services (Redshift, S3, Glue) to support analytics, reporting, and machine‑learning workloads.
- Collaborate with product, engineering, and business teams to translate requirements into robust ETL solutions and ensure data quality and governance.
- Optimize the performance, reliability, and cost of data platforms through monitoring, tuning, and automation.
- Mentor junior engineers, establish best practices, and drive continuous improvement of data engineering standards.
Requirements
- 5+ years of professional experience building large‑scale data pipelines and warehouses.
- Strong proficiency in Python, SQL, and Apache Spark (or similar distributed processing frameworks).
- Hands‑on experience with AWS data services (Redshift, S3, Glue, Lambda) and infrastructure‑as‑code tools.
- Solid understanding of data modeling, ETL design patterns, and streaming technologies such as Kafka.
- Excellent problem‑solving skills, the ability to work cross‑functionally, and a track record of mentoring technical teams.
Skills
Python, SQL, Apache Spark, AWS, Kafka