onsite
Data Engineer - Technologies
The Data Engineer designs, builds, and maintains scalable data pipelines and warehouses using Python, SQL, AWS, and Spark to support advanced manufacturing analytics and decarbonization initiatives.
About the role
Key Responsibilities
- Design, develop, and maintain robust ETL pipelines to ingest and transform manufacturing data from diverse sources.
- Build and optimize data warehouses and data lakes on AWS services such as Redshift, S3, and Glue.
- Implement scalable processing workflows using Apache Spark and Python to support real‑time and batch analytics.
- Collaborate with data scientists and product teams to provide clean, well‑documented datasets for machine‑learning models and reporting.
- Monitor pipeline performance, troubleshoot issues, and ensure data quality, security, and compliance.
Requirements
- 3+ years of hands‑on experience building data pipelines and warehouses in a cloud environment, preferably AWS.
- Proficiency in Python and SQL for data manipulation and automation.
- Experience with Apache Spark (or similar distributed processing frameworks) and ETL tools.
- Strong understanding of data modeling, schema design, and best practices for data governance.
- Ability to work cross‑functionally, communicate technical concepts clearly, and adapt to fast‑moving manufacturing technology needs.
Skills
Python, SQL, AWS, Apache Spark