Remote
IDCS Data Engineer - AI & Infrastructure Enablement - SHI
Data Engineer
Lead data engineering for AI and infrastructure projects, building scalable pipelines on AWS, optimizing data lakes, and collaborating with ML teams to deliver high‑performance analytics solutions.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines using Python, SQL, and Apache Spark on AWS.
- Implement and manage data lake architecture, ensuring data quality, lineage, and security.
- Collaborate with ML and AI teams to ingest, transform, and serve data for model training and inference.
- Orchestrate and automate workflows with Airflow, monitor pipeline performance, and troubleshoot issues.
- Optimize storage and compute costs while meeting SLAs for data processing.
Requirements
- 3+ years of data engineering experience in cloud environments.
Skills
Python, SQL, AWS, machine learning, Apache Spark, Airflow