onsite
Data Coordination & Integration Lead - Research Data & AI Readiness
Software Engineer
Oversee data coordination and integration for a leading cancer research institute, driving AI readiness through robust ETL pipelines, data governance and advanced analytics using SQL and Python.
About the role
Key Responsibilities
- Design, implement and maintain scalable ETL pipelines to ingest, transform and harmonise research data across multiple domains.
- Collaborate with scientists, data stewards and IT to define data standards, metadata schemas and governance policies.
- Lead AI readiness initiatives, ensuring data quality, lineage and compliance for machine‑learning projects.
- Provide technical mentorship to junior team members and promote best practices in data engineering.
- Monitor pipeline performance, troubleshoot issues and optimise for efficiency and reliability.
Requirements
- Proven experience designing and running data integration and ETL pipelines (SQL, Python, and Airflow or a similar orchestrator).
- Strong understanding of data governance, metadata management and regulatory compliance in a research environment.
- Familiarity with AI/ML workflows and the ability to translate research needs into data solutions.
- Excellent communication skills and a collaborative mindset.
- Experience in a life‑sciences or academic setting is highly desirable.