remote

Senior AI Data Pipeline Engineer, Autonomous Driving - 42dot

Software Engineer

Lead the design and implementation of scalable AI data pipelines for autonomous driving, leveraging Python, Spark, and AWS to ingest, process, and serve high‑volume sensor data to downstream ML models.

About the role

Key Responsibilities

Architect and develop end‑to‑end data pipelines that ingest raw vehicle sensor streams, perform real‑time preprocessing, and store processed data in cloud data lakes.
Optimize Spark jobs and SQL queries for performance and cost efficiency on AWS EMR and Redshift.
Collaborate with ML teams to expose clean, labeled datasets for training autonomous driving models.
Implement CI/CD workflows using Docker and Kubernetes to deploy pipeline components with zero downtime.
Monitor pipeline health, troubleshoot failures, and implement automated alerting and recovery mechanisms.

Requirements

5+ years of experience in data engineering, with a strong focus on large‑scale batch and streaming pipelines.
Proficiency in Python, Apache Spark, and SQL; experience with AWS services (S3, EMR, Redshift, Glue).
Hands‑on experience building containerized services and orchestrating them on Kubernetes.
Solid understanding of data quality, lineage, and metadata management.
Excellent problem‑solving skills and ability to work in a fast‑paced, cross‑functional team.

Skills

Python, Apache Spark, AWS, Docker, Kubernetes, SQL

Company: 42dot

Department: Engineering

Location: Sunnyvale, United States

Experience: 7+ years

Tenure: full-time

Level: Senior

Salary: 254,000

Posted June 21, 2026