Remote
Genomic Data Scientist - Genomics England
Data Scientist
Lead the analysis of large‑scale whole‑genome datasets, develop machine‑learning pipelines and statistical models, and translate genomic insights into clinical decision support using Python, R and cloud platforms.
About the role
Key Responsibilities
- Design, implement and optimise data‑processing pipelines for whole‑genome sequencing data at national scale.
- Develop and apply machine‑learning and statistical models to identify clinically relevant variants and biomarkers.
- Collaborate with clinicians, researchers and software engineers to integrate genomic analyses into NHS diagnostic workflows.
- Maintain and scale cloud‑based infrastructure (e.g., AWS) for secure storage, compute and reproducible research.
- Produce clear visualisations, reports and scientific publications that communicate findings to both technical and non‑technical audiences.
Requirements
- Advanced degree (MSc/PhD) in Bioinformatics, Computational Biology, Statistics, Computer Science or a related field.
- Proficiency in Python and R for data manipulation, statistical analysis and model development.
- Experience building and deploying machine‑learning pipelines on large genomic datasets.
- Strong knowledge of genomics concepts, variant annotation and clinical interpretation.
- Hands‑on experience with cloud platforms (AWS) and containerisation (Docker, Kubernetes) for scalable analysis.
Skills
Python, machine learning, AWS