remote
Principal Software Engineer - Data Lakes - Fivetran
Software Engineer
Lead the design and implementation of scalable data lake solutions using Python, SQL, Spark, and AWS, driving reliable data ingestion and transformation for enterprise customers.
About the role
Key Responsibilities
- Architect and develop high‑throughput data lake pipelines that ingest, transform, and store terabytes of data across cloud platforms.
- Collaborate with data scientists and product teams to define schemas, data quality standards, and performance benchmarks.
- Optimize Spark jobs and SQL queries for cost and latency, leveraging AWS services such as S3, Glue, and Redshift.
- Mentor junior engineers and review their code, fostering best practices in version control, testing, and documentation.
- Drive continuous improvement of data ingestion frameworks, ensuring resilience, scalability, and compliance with security policies.
Requirements
- 10+ years of software engineering experience with a focus on data engineering and large‑scale distributed systems.
- Proficiency in Python, SQL, and Spark, with hands‑on experience building production‑grade data pipelines.
- Deep knowledge of AWS data services (S3, Glue, Redshift, Athena) and data lake architecture patterns.
- Strong problem‑solving skills and the ability to translate business requirements into robust technical solutions.
- Excellent communication skills and a track record of mentoring and leading engineering teams.