onsite
Principal Data Engineer - CVS Health
Data Engineer
Lead the design, development, and maintenance of cloud-native data platforms using Python, SQL, and AWS services, driving scalable ETL pipelines and advanced analytics for enterprise health solutions.
About the role
Key Responsibilities
- Architect and implement end-to-end data pipelines on AWS, leveraging services such as S3, Redshift, Glue, and Lambda to ingest, transform, and store large volumes of health data.
- Develop and maintain scalable, cloud-native microservices in Python, ensuring high availability, performance, and security across production environments.
- Collaborate with data scientists, product managers, and business stakeholders to translate analytical requirements into robust data models and dashboards.
- Optimize SQL queries and Spark jobs for performance, cost, and reliability, applying best practices in data partitioning, indexing, and caching.
- Implement data governance, lineage, and quality checks, ensuring compliance with regulatory standards such as HIPAA.
- Mentor junior engineers, conduct code reviews, and promote a culture of continuous improvement and knowledge sharing.
Requirements
- 10+ years of experience in data engineering, with a strong focus on cloud-native architectures.
- Proficiency in Python, SQL, and Spark for large-scale data processing.
- Hands-on experience with AWS services (S3, Redshift, Glue, Lambda, EMR).
- Deep understanding of ETL design patterns, data modeling, and performance tuning.
- Excellent communication skills and ability to work cross-functionally in a fast-paced environment.