Onsite
Senior Data Engineer - Bristol Myers Squibb
Data Engineer
Senior Data Engineer responsible for designing, building, and optimizing large‑scale data pipelines and warehouses using Python, Spark, SQL, and AWS services, while ensuring reliable orchestration with Airflow and robust data models.
About the role
Key Responsibilities
- Design, develop, and maintain scalable data pipelines that ingest, transform, and load high‑volume datasets from diverse sources.
- Implement data models and warehouse solutions on AWS (Redshift, S3, Glue) to support analytics and machine‑learning teams.
- Automate workflow orchestration and monitoring using Apache Airflow, ensuring reliability and timely delivery.
- Optimize Spark jobs and SQL queries for performance, cost efficiency, and scalability.
- Collaborate with cross‑functional stakeholders to translate business requirements into technical specifications and data solutions.
Requirements
- 5+ years of professional experience in data engineering or related fields.
- Strong proficiency in Python and SQL, with hands‑on experience building ETL pipelines.
- Deep knowledge of Apache Spark and distributed data processing concepts.
- Extensive experience with AWS data services (Redshift, S3, Glue, Lambda) and infrastructure as code.
- Proven ability to design data models, implement Airflow DAGs, and ensure data quality and governance.
Skills
Python, SQL, Apache Spark, AWS, Airflow