Remote
IDCS Data Engineer - AI & Infrastructure Enablement - SHI International
Data Engineer
Lead data engineering for AI initiatives, building scalable pipelines on AWS, optimizing Spark workloads, and integrating machine learning models into production data lakes.
About the role
Key Responsibilities
- Design, develop, and maintain large‑scale data pipelines using Python and Apache Spark on AWS.
- Collaborate with data scientists to deploy ML models into production data lakes.
- Optimize ETL processes for performance, reliability, and cost efficiency.
- Implement data governance, security, and compliance best practices.
- Monitor pipeline health, troubleshoot issues, and provide proactive improvements.
Requirements
- 3+ years of data engineering experience in cloud environments.
Skills
Python, Apache Spark, AWS, SQL, Machine Learning