onsite
Data Engineer - Amgen
Build and operate large‑scale healthcare data pipelines with Python, Spark, Airflow, and AWS that ingest, transform, and publish data for analytics.
About the role
Key Responsibilities
- Design, develop, and maintain batch and streaming data pipelines for healthcare datasets using Python and Apache Spark.
- Implement metadata‑driven ingestion workflows and automate data quality checks with Airflow.
- Deploy and manage data services on AWS, ensuring scalability, reliability, and security.
- Collaborate with senior engineers, product owners, and business analysts to translate requirements into technical solutions.
- Optimize pipeline performance, monitor job health, and troubleshoot issues in production.
Requirements
- 3+ years of experience in data engineering, preferably in healthcare or life sciences.
- Strong proficiency in Python, SQL, and Spark for large‑scale data processing.
- Hands‑on experience with Airflow, AWS services (S3, Redshift, EMR), and data lake architectures.
- Solid understanding of ETL concepts, data modeling, and data quality best practices.
- Excellent communication skills and the ability to collaborate in a fast‑moving environment.
Skills
Python, SQL, Apache Spark, Airflow, AWS