onsite

Senior Data Engineer - Large Driving Model Autonomy - Rivian

Data Engineer

The Senior Data Engineer designs and scales the data pipelines that feed large autonomous driving models, using Python, Spark, SQL, and AWS to deliver reliable, high‑throughput data for machine‑learning workloads.

About the role

Key Responsibilities

Design, build, and maintain robust, scalable data pipelines that ingest, transform, and store sensor and simulation data for autonomous driving models.
Collaborate with ML scientists and software engineers to define data requirements and ensure data quality, latency, and availability.
Implement data processing solutions using Apache Spark and SQL on AWS services such as S3, Redshift, and Glue.
Develop monitoring, alerting, and automated testing frameworks to guarantee pipeline reliability and performance at scale.
Optimize storage and compute costs while maintaining compliance with security and governance standards.

Requirements

5+ years of experience building large‑scale data pipelines in Python and Spark.
Strong proficiency with SQL and relational/columnar data stores (e.g., Redshift, Snowflake, BigQuery).
Hands‑on experience with AWS data services (S3, EMR, Glue, Lambda) and infrastructure‑as‑code tools.
Familiarity with machine‑learning data workflows and with dataset versioning for autonomous vehicle data.
Excellent problem‑solving skills and ability to work cross‑functionally in a fast‑paced, innovative environment.

Skills

Python, Apache Spark, SQL, AWS

Company: Rivian

Department: Engineering

Location: Stanford, United States

Experience: 5+ years

Tenure: full-time

Level: Senior

Posted June 23, 2026