onsite
Data Analyst - Ancestry & Health Genomics Lab - University of Sydney
The Data Analyst supports the Ancestry & Health Genomics Lab by using Python, R, and Bash to process large‑scale genomic datasets on HPC clusters, develop data pipelines, and generate analytical reports.
About the role
Key Responsibilities
- Develop, maintain, and optimise data processing pipelines for genomic and health‑related datasets using Python, R, and Bash.
- Execute large‑scale analyses on high‑performance computing (HPC) clusters, ensuring efficient use of resources and reproducibility.
- Integrate and query diverse data sources (e.g., sequencing files, clinical records) using SQL and data‑engineering best practices.
- Collaborate with bioinformaticians and researchers to translate scientific questions into analytical workflows.
- Produce clear visualisations, summary statistics, and documentation for internal and external stakeholders.
Requirements
- Honours or Master’s degree in Bioinformatics, Computational Science, Data Engineering, or a related field.
- Proficiency in Python and R for statistical analysis and scripting.
- Strong command‑line skills (Bash) and experience working in Linux/HPC environments.
- Hands‑on experience with SQL databases and handling large, complex datasets.
- Familiarity with bioinformatics tools and workflows for genomics data processing.