onsite
Senior Data Engineer - PathWest Laboratory Medicine WA
Data Engineer
About the role
Lead end‑to‑end data engineering for large‑scale healthcare analytics, building robust pipelines on AWS, optimizing SQL and Spark workloads, and designing scalable data models to support advanced analytics and reporting.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python, SQL, and Apache Spark on AWS services (S3, Redshift, Glue).
- Implement data ingestion, transformation, and quality checks for high‑volume clinical datasets.
- Collaborate with data scientists and analysts to deliver reliable data assets for predictive modeling and reporting.
- Optimize query performance and storage costs through data modeling, partitioning, and compression techniques.
- Ensure compliance with data governance, security, and privacy standards in a regulated healthcare environment.
Requirements
- 5+ years of data engineering experience in a large enterprise or healthcare setting.
- Proficiency in Python, SQL, and Spark for batch and streaming data processing.
- Hands‑on experience with AWS data services (S3, Redshift, Glue, Athena).
- Strong understanding of data modeling, ETL best practices, and performance tuning.
- Excellent problem‑solving skills and the ability to work cross‑functionally in a fast‑paced environment.
Skills
Python, SQL, AWS, Apache Spark