Remote
Principal Data Engineer - Metropolitan Council of the Twin Cities
Data Engineer
Lead the design, development, and optimization of large‑scale data pipelines and platforms, leveraging Python, SQL, Spark, and AWS to deliver reliable, high‑performance data solutions for regional planning and services.
About the role
Key Responsibilities
- Architect, build, and maintain scalable data pipelines and lakehouse solutions supporting regional transportation, wastewater, and housing analytics.
- Design data models and schemas that enable self‑service analytics and reporting for cross‑departmental stakeholders.
- Lead the migration of on‑premises workloads to AWS services such as S3, Redshift, and Glue, and optimize them after migration.
- Implement robust ETL/ELT processes using Python, SQL, and Apache Spark, ensuring data quality, lineage, and governance.
- Mentor junior engineers, establish best practices, and drive continuous improvement of the data engineering stack.
Requirements
- 10+ years of hands‑on experience in data engineering, with a proven track record of delivering enterprise‑grade data platforms.
- Deep expertise in Python, SQL, and distributed processing frameworks (e.g., Apache Spark, Flink).
- Extensive experience designing and operating data solutions on AWS, including S3, Redshift, Glue, and Lambda.
- Strong knowledge of data modeling, schema design, and ETL/ELT architecture.
- Excellent problem‑solving skills, ability to work cross‑functionally, and experience mentoring technical teams.
Skills
Python, SQL, Apache Spark, AWS