remote
Senior Data Engineer - Optum
Data Engineer
Lead end‑to‑end data engineering for large healthcare datasets, building scalable pipelines on AWS, optimizing SQL and Spark workloads, and designing robust data models to support analytics and AI initiatives.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Python, SQL, and Apache Spark on AWS services (Glue, EMR, Redshift).
- Implement data ingestion, transformation, and quality checks for high‑volume healthcare datasets.
- Collaborate with data scientists and analysts to deliver clean, well‑documented data assets for modeling and reporting.
- Optimize SQL and Spark query performance and cluster resource utilization across distributed environments.
- Ensure pipelines and datasets comply with data governance, security, and privacy requirements.
Requirements
- 5+ years of data engineering experience in a cloud environment.
- Proficiency in Python, SQL, and Spark for large‑scale data processing.
- Hands‑on experience with AWS data services (Glue, EMR, Redshift, S3).
- Strong understanding of data modeling, ETL best practices, and performance tuning.
- Excellent problem‑solving skills and ability to work cross‑functionally in a fast‑paced setting.
Skills
python, sql, aws, apache spark