remote

Principal Research Data Engineer - Bayer

Data Engineer

Lead advanced data engineering initiatives, architecting scalable pipelines and data models to support research analytics. Leverage Python, Spark, and AWS to deliver high‑performance, reproducible data solutions for scientific discovery.

About the role

Key Responsibilities

Design, develop, and maintain large‑scale data pipelines that ingest, transform, and serve research data across multiple domains.
Collaborate with data scientists and domain experts to define data models, schemas, and metadata standards that enable reproducible research.
Implement performance‑optimized solutions using Apache Spark, Python, and SQL on AWS infrastructure (EMR, Redshift, S3).
Ensure data quality, lineage, and governance through automated testing, monitoring, and documentation.
Mentor and guide junior engineers, fostering best practices in coding, version control, and DevOps.

Requirements

10+ years of experience in data engineering, with a strong focus on research or scientific data environments.
Proficiency in Python, SQL, and Apache Spark for large‑scale data processing.
Hands‑on experience deploying and managing data solutions on AWS (EMR, Redshift, S3, Glue).
Deep understanding of data modeling, ETL design, and data governance principles.
Excellent communication skills and a proven ability to translate complex technical concepts for cross‑functional teams.

Skills

Python, SQL, Apache Spark, AWS, Machine Learning

Company: Bayer

Department: Engineering

Location: St. Louis, Missouri, United States

Experience: 5+ years

Tenure: full-time

Level: Lead

Salary: 185,000

Posted June 21, 2026