Remote
Senior Data Engineer - abstra
Data Engineer
Lead end‑to‑end data pipeline development using Python, SQL, and Spark on AWS, designing scalable data models and ensuring high‑quality, reliable data for analytics and ML.
About the role
Key Responsibilities
- Design, build, and maintain large‑scale data pipelines using Python, SQL, and Apache Spark on AWS.
- Develop and optimize ETL processes to ingest, transform, and load data from diverse sources into data lakes and warehouses.
- Collaborate with data scientists and analysts to define data models, schemas, and performance metrics.
- Implement data quality checks, monitoring, and alerting to ensure data reliability and compliance.
- Document architecture, code, and best practices for future maintenance and onboarding.
Requirements
- 5+ years of experience in data engineering or related roles.
- Proficiency in Python, SQL, and Spark for large‑scale data processing.
- Hands‑on experience with AWS services (S3, Redshift, Glue, EMR).
- Strong understanding of data modeling, ETL design, and performance tuning.
- Excellent problem‑solving skills and the ability to work collaboratively in a fast‑paced environment.
Skills
Python, SQL, AWS, Apache Spark