onsite
Principal Data Engineer - NSW Health
Data Engineer
About the role
Lead the design, development, and maintenance of large-scale data pipelines and analytics solutions using Python, SQL, AWS, and Spark to empower mental health services with actionable insights.
Key Responsibilities
- Architect and implement scalable data pipelines and ETL processes on AWS to ingest, transform, and store health data.
- Develop and maintain Spark jobs and Python scripts for data processing and feature engineering.
- Collaborate with data scientists and analysts to design data models that support predictive analytics and reporting.
- Ensure data quality, governance, and compliance with health data regulations.
- Optimize the performance and cost of data workflows using AWS services such as S3, Redshift, Glue, and Athena.
Requirements
- Extensive experience in Python, SQL, and Spark for large-scale data processing.
- Proven expertise in AWS data services (S3, Redshift, Glue, Athena, EMR).
- Strong understanding of data modeling, ETL design, and data governance principles.
- Excellent problem-solving skills and the ability to work collaboratively in a multidisciplinary team.
- Experience in the health or public sector is an advantage.