Remote
Staff Software Engineer - Semantic Data Lake - WEX
Software Engineer
Lead the design and implementation of a semantic data lake, transforming raw enterprise data into reusable, trusted assets using Python, AWS, SQL, and Spark. Drive architecture, data modeling, and performance at scale.
About the role
Key Responsibilities
- Architect and build a scalable semantic data lake on AWS, ensuring high availability, security, and performance.
- Design and implement data ingestion pipelines using Python, SQL, and Apache Spark to transform raw data into semantically enriched assets.
- Collaborate with data scientists and product teams to define data models, metadata standards, and governance policies.
- Optimize query performance and reduce storage costs through partitioning, indexing, and a cost‑effective data lakehouse design.
- Mentor and lead a small team of engineers, driving best practices in code quality, CI/CD, and automated testing.
Requirements
- 10+ years of software engineering experience with a focus on data platforms.
- Proficiency in Python, SQL, and Apache Spark for large‑scale data processing.
- Deep experience with AWS services (S3, Glue, Athena, Redshift, Lake Formation).
- Strong understanding of data modeling, semantic enrichment, and data governance.
- Excellent communication skills and ability to influence cross‑functional teams.
Skills
Python, AWS, SQL, Apache Spark