onsite
Data Engineer - Data Foundations for AI - Serrala Group GmbH
Build and maintain scalable data pipelines that power AI initiatives, leveraging Python, SQL, AWS, and Spark to transform raw data into reliable, high‑quality datasets for advanced analytics.
About the role
Key Responsibilities
- Design, develop, and optimize data pipelines using Python, SQL, and Spark to ingest, transform, and store large volumes of structured and unstructured data.
- Implement and maintain data models and schemas in AWS data services (Redshift, S3, Glue), ensuring data quality and consistency.
- Collaborate with data scientists and ML engineers to provide clean, well‑documented datasets for model training and inference.
- Automate workflow orchestration with Airflow, monitor pipeline health, and troubleshoot performance bottlenecks.
- Document data lineage, metadata, and best practices to support governance and compliance.
Requirements
- 3+ years of experience as a data engineer in a cloud environment.
Skills
Python, SQL, AWS, Apache Spark, Airflow