onsite
Sr Data Engineer - Integrations - Health Catalyst
Data Engineer
Senior Data Engineer focused on building scalable integration pipelines, leveraging Python, SQL, and AWS services to deliver high‑quality data solutions for healthcare analytics.
About the role
Key Responsibilities
- Design, develop, and maintain robust data integration pipelines using Python, SQL, and Spark to ingest, transform, and load data from diverse healthcare sources.
- Collaborate with data scientists and business analysts to understand data requirements and translate them into efficient ETL workflows.
- Implement and manage Airflow DAGs for automated, scheduled data processing and monitoring.
- Optimize data storage and retrieval in AWS data services (Redshift, S3, Glue) to support large‑scale analytics workloads.
- Ensure data quality, lineage, and compliance with healthcare regulations through rigorous testing and documentation.
Requirements
- 5+ years of experience in data engineering, with a strong focus on integration and pipeline development.
- Hands‑on experience with AWS data services (Redshift, S3, Glue, Athena) and orchestration tools such as Airflow.
- Solid understanding of data modeling, ETL best practices, and data governance in a regulated environment.
- Excellent problem‑solving skills and ability to work collaboratively in a fast‑paced, cross‑functional team.
Skills
Python, SQL, AWS, Apache Spark, Airflow