remote
IT Principal Data Engineering - Save A Lot
Software Engineer
Lead the design, build, and operation of scalable data pipelines and platforms that enable AI and analytics across the organization, blending deep engineering expertise with a data‑science mindset to deliver reliable, high‑performance data flows.
About the role
Key Responsibilities
- Architect, develop, and maintain end‑to‑end data pipelines using Python, Apache Spark, and SQL on AWS services (Glue, Redshift, S3).
- Design and implement a robust data lake and lakehouse architecture that supports real‑time and batch analytics for data science teams.
- Collaborate with data scientists to operationalize machine learning models, providing reliable data access and ongoing model monitoring.
- Establish data governance, quality, and security standards across all data assets.
- Mentor and guide a small team of data engineers, fostering best practices and continuous improvement.
Requirements
- 10+ years of experience in data engineering, with a strong background in large‑scale distributed processing.
- Proficiency in Python, Spark, and SQL; experience with AWS data services (Glue, Redshift, Athena, S3).
- Deep understanding of data lake and lakehouse concepts, data modeling, and metadata management.
- Hands‑on experience deploying and monitoring ML models in production environments.
- Excellent communication skills and a proven ability to translate business needs into technical solutions.
Skills
Python, Apache Spark, SQL, AWS