remote

Senior Specialist, Data Engineer Upstream Biologics - Merck

Data Engineer

Senior Data Engineer leading the design, development, and optimization of scalable data pipelines and analytics platforms for upstream biologics, using Python, Spark, SQL, and AWS services.

About the role

Key Responsibilities

Design, build, and maintain robust, high‑performance data pipelines that ingest, transform, and store large‑scale biologics research data.
Develop and optimize Spark jobs and SQL queries to support downstream analytics, machine‑learning models, and reporting.
Implement data‑modeling standards and data‑warehouse architectures on AWS (e.g., Redshift, S3, Glue) to ensure data integrity and accessibility.
Collaborate with scientists, bio‑informaticians, and IT teams to translate research requirements into scalable data solutions.
Automate ETL workflows, monitor performance, and troubleshoot production issues using Linux‑based tools and cloud monitoring services.

Requirements

5+ years of professional experience in data engineering, preferably in biopharma or life‑science environments.
Strong proficiency in Python, SQL, and Apache Spark for large‑volume data processing.
Hands‑on experience with AWS services (Redshift, S3, Glue, Lambda) and infrastructure‑as‑code concepts.
Solid understanding of data modeling, schema design, and ETL best practices.
Excellent problem‑solving skills, ability to work cross‑functionally, and effective communication of technical concepts to non‑technical stakeholders.

Skills

Python, SQL, Apache Spark, AWS, Linux

Company: Merck

Department: Engineering

Location: Boston, Massachusetts, United States

Experience: 5+ years

Tenure: full-time

Level: Senior

Salary: 184,200

Posted June 26, 2026