On-site

Senior Data Engineer - Data Science Platform

Data Scientist

About the role

The Senior Data Engineer designs, builds, and optimizes large-scale data pipelines on Azure, using Apache Spark and Azure Data Factory to support a data science platform.

Key Responsibilities

Design and implement end‑to‑end data pipelines using Apache Spark and Azure Data Factory to ingest, transform, and store massive datasets.
Develop and maintain data lake architectures on Azure Data Lake and Azure Data Lake Storage, ensuring high availability and security.
Collaborate with data scientists and analysts to provide reliable, well‑documented data sources for machine‑learning models and analytics.
Optimize performance and cost of data processing jobs, monitoring workloads and tuning Spark configurations.
Implement data governance, lineage, and quality checks across the platform.

Requirements

5+ years of experience building data pipelines on Azure, with deep expertise in Apache Spark.
Proficiency in Azure Data Factory, Azure Data Lake, and Azure Data Lake Storage services.
Strong SQL and programming skills (Python/Scala) for data transformation and automation.
Experience with data modeling, ETL best practices, and performance tuning in distributed environments.
Solid understanding of data security, governance, and CI/CD for data engineering workflows.

Skills

Apache Spark, Azure

Department: Research

Location: London, United Kingdom

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 25, 2026