onsite
Principal Data Engineer - Roche
Data Engineer
Lead the design and implementation of scalable data pipelines and platforms using Python, Spark, SQL, and AWS, while driving data strategy and mentoring a team of engineers.
About the role
Key Responsibilities
- Architect, develop, and maintain high‑performance data pipelines and lakehouse solutions on AWS.
- Design robust data models and schemas to support analytics, machine‑learning, and reporting workloads.
- Implement CI/CD practices and automated testing to ensure reliable, repeatable data deployments.
- Collaborate with data scientists, product owners, and business stakeholders to translate requirements into scalable data solutions.
- Mentor and lead a team of data engineers, fostering best practices and continuous improvement.
Requirements
- 5+ years of hands‑on experience building data pipelines using Python, SQL, and Apache Spark.
- Strong expertise in AWS services (e.g., S3, Redshift, Glue, Lambda, EMR) and infrastructure‑as‑code tools.
- Proven ability to design and optimize data models for large‑scale analytics environments.
- Experience with CI/CD pipelines, containerization and orchestration (Docker, Kubernetes), and version control (Git).
- Excellent problem‑solving skills and the ability to lead technical discussions with cross‑functional teams.
Skills
Python, SQL, Apache Spark, AWS, CI/CD