onsite
Agentic AI Data Engineer - Kyndryl
Data Engineer
Build and operate data pipelines for Agentic AI solutions, using Python, Spark, and AWS to deliver high‑quality data at scale for advanced machine‑learning models.
About the role
Key Responsibilities
- Design, develop, and maintain robust data pipelines that ingest, transform, and store large‑scale datasets for Agentic AI applications.
- Collaborate with AI researchers and product teams to define data requirements and ensure data quality, consistency, and governance.
- Implement scalable processing solutions using Apache Spark and serverless services on AWS (e.g., Lambda, Glue, S3).
- Optimize SQL queries and data models for performance and cost efficiency in cloud environments.
- Monitor pipeline health, troubleshoot issues, and continuously improve reliability through automation and best practices.
Requirements
- 5+ years of experience in data engineering, with strong proficiency in Python and SQL.
- Hands‑on experience building ETL/ELT pipelines using Apache Spark and cloud services (AWS preferred).
- Solid understanding of data modeling, warehousing concepts, and data governance.
- Familiarity with machine‑learning workflows and the ability to support data pipelines for model training.
- Excellent problem‑solving skills and the ability to work cross‑functionally in a fast‑paced, innovative environment.
Skills
Python, SQL, Apache Spark, AWS, Machine Learning