Systems Engineer
Distributed Systems Engineer at Menlo designing and scaling the cloud infrastructure that supports fleets of humanoid robots, focusing on Kubernetes orchestration, gRPC communication, and robust networking across global deployments.
About Menlo
Menlo Research is an applied R&D lab building Asimov, an open-source humanoid robot platform, and the full software stack that powers it. Our mission is to make humanoid labor economically viable, turning software into physical labor at scale. We build across the full stack: hardware architecture, locomotion, autonomy, simulation, and infrastructure. We move fast, ship to real robots, and open-source everything we can. If you want your work to matter beyond a paper or a demo, this is the place.
The Role
We are looking for a Distributed Systems Engineer to architect and scale the infrastructure that powers fleets of humanoid robots operating across the world. Your work will span the full stack of robotics infrastructure, from low-latency streaming and cloud simulation to large-scale training and telemetry pipelines. You will work directly with the founders and technical leadership to design the systems that let hundreds of robots learn, share, and act as one.
What You Will Do
Architect and scale distributed systems that handle petabytes of sensory, telemetry, and control data across cloud and edge environments
Design data ingestion and streaming pipelines connecting fleets of robots to the cloud in real time (video, LiDAR, joint states, audio)
Build large-scale training and inference platforms for multimodal foundation models powering robot autonomy and teleoperation
Collaborate with ML and Robotics engineers to support hardware-in-the-loop simulation, policy rollout, and continuous learning
Develop internal observability systems for fleet monitoring, reliability, and performance tuning
Lead infrastructure decisions, from distributed storage and consensus protocols to GPU orchestration and network reliability
What You Will Bring
7+ years of professional software engineering experience, with deep expertise in distributed systems, networking, or data infrastructure
Proven ability to build and operate production-grade distributed systems handling massive scale and mission-critical workloads
Proficiency in Go, Rust, C++, or Python, with strong fundamentals in concurrency, networking, and systems performance
Experience with cloud-native architectures and tooling (Kubernetes, gRPC, Kafka, S3, Ray, or similar technologies)
Strong understanding of data consistency, replication, and fault tolerance across heterogeneous environments
Experience with GPU-based workloads, model training, or edge compute orchestration is a strong plus
Excellent analytical skills and a bias toward building fast, measurable, and reliable systems
Bonus Points
Experience building distributed training or large-scale simulation systems
Familiarity with real-time robotics workloads, including streaming from physical sensors and actuators
Prior work with telemetry, obser
Posted June 18, 2026