Remote
Senior Data Engineer - Upstream Biologics - MSD
Data Engineer
The Senior Data Engineer designs, builds, and maintains scalable data pipelines and analytics platforms that support upstream biologics research, using Python, SQL, AWS, Snowflake, and Airflow.
About the role
Key Responsibilities
- Design, develop, and optimize end‑to‑end data pipelines that ingest, transform, and store large‑scale biologics research data.
- Implement and maintain data warehouse solutions on Snowflake, ensuring high performance and data integrity.
- Automate workflow orchestration using Apache Airflow, integrating with AWS services such as S3, Redshift, and Lambda.
- Collaborate with scientists, bioinformaticians, and product teams to translate research requirements into robust data models and analytics solutions.
- Monitor, troubleshoot, and improve existing ETL processes, applying best practices for data quality, security, and compliance.
Requirements
- 5+ years of professional experience in data engineering, preferably in biopharma or life‑science environments.
- Strong proficiency in Python and SQL for data manipulation and pipeline development.
- Hands‑on experience with AWS cloud services (S3, EC2, Lambda, Glue) and Snowflake data warehousing.
- Demonstrated expertise in building and scheduling workflows with Apache Airflow or similar orchestration tools.
- Solid understanding of ETL/ELT design patterns, data modeling, and performance tuning.
Skills
Python, SQL, AWS, Snowflake