onsite

Distributed Systems Engineer - Menlo

Systems Engineer

Menlo is hiring a Distributed Systems Engineer to design and scale the cloud infrastructure that supports fleets of humanoid robots, with a focus on Kubernetes orchestration, gRPC communication, and robust networking across global deployments.

About the role

About Menlo

Menlo Research is an applied R&D lab building Asimov, an open-source humanoid robot platform, and the full software stack that powers it. Our mission is to make humanoid labor economically viable by turning software into physical labor at scale. We build across the full stack: hardware architecture, locomotion, autonomy, simulation, and infrastructure. We move fast, ship to real robots, and open-source everything we can. If you want your work to matter beyond a paper or a demo, this is the place.

The Role

We are looking for a Distributed Systems Engineer to architect and scale the infrastructure that powers fleets of humanoid robots operating across the world. Your work will span the full stack of robotics infrastructure, from low-latency streaming and cloud simulation to large-scale training and telemetry pipelines. You will work directly with the founders and technical leadership to design the systems that let hundreds of robots learn, share, and act as one.

What You Will Do

Architect and scale distributed systems that handle petabytes of sensory, telemetry, and control data across cloud and edge environments

Design data ingestion and streaming pipelines connecting fleets of robots to the cloud in real time (video, LiDAR, joint states, audio)

Build large-scale training and inference platforms for multimodal foundation models powering robot autonomy and teleoperation

Collaborate with ML and robotics engineers to support hardware-in-the-loop simulation, policy rollout, and continuous learning

Develop internal observability systems for fleet monitoring, reliability, and performance tuning

Lead infrastructure decisions, from distributed storage and consensus protocols to GPU orchestration and network reliability

What You Will Bring

7+ years of professional software engineering experience, with deep expertise in distributed systems, networking, or data infrastructure

Proven ability to build and operate production-grade distributed systems handling massive scale and mission-critical workloads

Proficiency in Go, Rust, C++, or Python, with strong fundamentals in concurrency, networking, and systems performance

Experience with cloud-native architectures (Kubernetes, gRPC, Kafka, S3, Ray, or similar frameworks)

Strong understanding of data consistency, replication, and fault tolerance across heterogeneous environments

Experience with GPU-based workloads, model training, or edge compute orchestration is a strong plus

Excellent analytical skills and a bias toward building fast, measurable, and reliable systems

Bonus Points

Experience building distributed training or large-scale simulation systems

Familiarity with real-time robotics workloads, including streaming from physical sensors and actuators

Prior work with telemetry, obser

