Remote
Senior Data Engineer - Elsevier
Data Engineer
Senior Data Engineer building scalable, high‑performance data pipelines for large‑scale search and analytics, using Python, Spark, Airflow, and AWS to deliver reliable solutions with real business impact.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines that ingest, transform, and serve large volumes of structured and unstructured data for search and AI products.
- Implement scalable ETL workflows using Apache Spark, Python, and SQL, ensuring data quality, performance, and reliability.
- Orchestrate pipeline execution and monitoring with Airflow, creating DAGs that support real‑time and batch processing.
- Collaborate with data scientists, product managers, and platform teams to define data models, schemas, and metadata standards.
- Mentor junior engineers, conduct code reviews, and promote best practices in data engineering and DevOps.
Requirements
- 5+ years of experience in data engineering, with a strong background in Python, SQL, and Spark.
- Proven expertise in building and managing Airflow DAGs and orchestrating complex data workflows.
- Hands‑on experience with AWS services (S3, Redshift, EMR, Glue) and data lake architecture.
- Solid understanding of data modeling, schema design, and performance tuning for large‑scale analytics.
- Excellent problem‑solving skills, strong communication, and a collaborative mindset.
Skills
Python, SQL, Apache Spark, Airflow, AWS