onsite
Senior Data Engineer - MCKESSON
Data Engineer
Lead end‑to‑end data pipeline development, architect scalable data solutions on AWS, and drive data quality and governance for healthcare analytics.
About the role
Key Responsibilities
- Design, build, and maintain robust data pipelines using Python, SQL, and Apache Spark on AWS services (S3, Redshift, Glue).
- Collaborate with data scientists and product teams to translate business requirements into scalable data models and ETL processes.
- Implement data quality checks, monitoring, and alerting to ensure high data integrity and availability.
- Optimize query performance and storage costs through advanced indexing, partitioning, and compression techniques.
- Document architecture, data flows, and best practices for future maintainability.
Requirements
- 5+ years of experience in data engineering, preferably in healthcare or large enterprise environments.
- Proficiency in Python and SQL, plus experience with Spark or a similar distributed processing framework.
- Hands‑on experience with AWS data services (Redshift, Glue, Athena, S3, Lambda).
- Strong understanding of data modeling, ETL design, and data governance principles.
- Excellent problem‑solving skills and the ability to work cross‑functionally in a fast‑paced setting.
Skills
Python, SQL, AWS, Apache Spark