Senior Data Engineer - The Clatterbridge Cancer Centre NHS Foundation Trust
Senior Data Engineer building data pipelines and analytics for cancer research, using Python, Spark, and AWS to turn clinical data into actionable insights.
About the role
Key Responsibilities
Design, develop, and maintain scalable data pipelines using Python and Apache Spark to ingest, transform, and load large volumes of clinical and research data.
Implement robust ETL processes and data quality checks using AWS services (S3, Glue, Redshift, Athena) to support research analytics.
Collaborate with clinical researchers and data scientists to model data, create data marts, and deliver high‑quality datasets for studies.
Optimize query performance and storage costs through effective data warehousing and partitioning strategies.
Document architecture, data flows, and best practices; mentor junior engineers on data engineering principles.
Requirements
5+ years of experience in data engineering, with a strong background in Python and SQL.
Hands‑on experience with Apache Spark, AWS Glue, Redshift, and related cloud services.
Proven ability to design and implement ETL pipelines and data models for large, complex datasets.
Excellent problem‑solving skills and a collaborative mindset, especially with clinical and research stakeholders.
Strong communication skills and a passion for improving patient outcomes through data.