remote
Principal Data Engineer - Empower
Data Engineer
Lead enterprise‑scale data engineering initiatives, architecting robust pipelines and data lakes using Python, Spark, and AWS services, while driving best practices in data quality, governance, and automation.
About the role
Key Responsibilities
- Design, build, and maintain large‑scale data pipelines and lakehouse architectures that support analytics, reporting, and machine learning workloads.
- Lead the migration of legacy data systems to modern cloud platforms, ensuring high availability, scalability, and cost efficiency.
- Collaborate with data scientists, product managers, and business stakeholders to translate business requirements into technical specifications and data solutions.
- Implement robust data quality, lineage, and governance frameworks using tools such as Airflow, Delta Lake, and AWS Glue.
- Mentor and coach junior engineers, fostering a culture of continuous learning and technical excellence.
Requirements
- 10+ years of experience in data engineering, with a proven track record of delivering production‑grade data solutions.
- Expertise in Python, SQL, and Apache Spark for batch and streaming data processing.
- Deep knowledge of AWS data services (S3, Redshift, Glue, Athena, EMR) and experience with Airflow orchestration.
- Strong understanding of data modeling, ETL best practices, and data governance principles.
- Excellent communication skills and ability to influence cross‑functional teams.
Skills
Python, SQL, Apache Spark, AWS, Airflow