onsite

Staff Machine Learning Engineer - VLM/LLM Evaluation - Waymo

ML Engineer

Lead the design and implementation of evaluation frameworks for vision‑language and large language models, driving metrics, data pipelines, and scalable infrastructure to improve autonomous driving perception and decision systems.

About the role

Key Responsibilities

Design and build robust evaluation pipelines for Vision‑Language Models (VLMs) and Large Language Models (LLMs) used in perception and planning stacks.
Develop and maintain metrics, benchmarks, and automated testing suites to assess model performance, safety, and reliability at scale.
Collaborate with research, simulation, and product teams to integrate evaluation results into the continuous improvement loop of the autonomous driving system.
Implement distributed training and inference workflows using Python, TensorFlow, and PyTorch on large‑scale compute clusters.
Analyze failure cases, generate insights, and propose model or data enhancements to meet safety and accuracy targets.

Requirements

Ph.D. or Master’s in Computer Science, Electrical Engineering, or a related field, with 7+ years of hands‑on ML experience.
Deep expertise in LLMs, VLMs, and modern deep‑learning frameworks (TensorFlow, PyTorch).
Proven track record building large‑scale evaluation infrastructure, metrics, and data pipelines.
Strong programming skills in Python and experience with distributed computing platforms (e.g., Kubernetes, Ray, Spark).
Excellent problem‑solving ability and communication skills for working with cross‑functional teams.

Skills

Python, machine learning, TensorFlow, PyTorch

Company: Waymo

Department: Research

Location: Hayes Valley, United States

Experience: 7+ years

Tenure: Full-time

Level: Lead

Posted June 24, 2026