remote

Senior AI Data Pipeline Engineer - 42dot

Software Engineer

Lead the design and scaling of high‑throughput, petabyte‑scale data pipelines that feed global AI workloads, leveraging Python, Spark, AWS, Kubernetes and GPU infrastructure to ensure reliability and multi‑region availability.

About the role

Key Responsibilities

Architect and build high‑performance, scalable data pipelines to ingest and process petabyte‑scale data for AI workloads.
Design multi‑region data infrastructure ensuring global availability and seamless synchronization.
Implement flexible branching and logic isolation to support concurrent AI projects.
Operate and optimize GPU‑enabled data pipelines on large‑scale cloud infrastructure.
Collaborate with data scientists and ML engineers to meet evolving data needs.

Requirements

Extensive experience with Python and Apache Spark for large‑scale data processing.
Proficiency in AWS services (S3, EMR, Glue, Redshift) and Kubernetes orchestration.
Strong background in GPU infrastructure and high‑throughput system design.
Hands‑on experience with multi‑region architecture and data synchronization.
Excellent problem‑solving skills and ability to work in a fast‑paced, global environment.

Skills

Python, Apache Spark, AWS, Kubernetes

Company: 42dot

Department: Engineering

Location: Pangyo, Republic of Korea

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 21, 2026