onsite

Software Engineer, Inference Platform - Cerebras Systems

Software Engineer

Develop high‑performance inference software for a wafer‑scale AI chip, leveraging Python, C++, and CUDA to optimize distributed workloads and accelerate large‑scale ML applications.

About the role

Key Responsibilities

Design, implement, and maintain inference pipelines that run on a wafer‑scale AI chip, ensuring optimal utilization of the device’s compute resources.
Collaborate with hardware and systems teams to translate architectural specifications into efficient, scalable software solutions.
Optimize performance-critical code using CUDA, C++, and low‑level profiling tools, targeting latency and throughput improvements.
Integrate machine‑learning frameworks (e.g., TensorFlow, PyTorch) with the inference platform, providing seamless deployment for end users.
Develop automated testing, continuous integration, and deployment pipelines to guarantee reliability and rapid iteration.

Requirements

Strong experience in C++ and CUDA for high‑performance computing.
Proficiency in Python and familiarity with ML frameworks.
Solid understanding of distributed systems and parallel programming concepts.
Experience with profiling, debugging, and performance tuning on GPU architectures.
Excellent problem‑solving skills and a collaborative mindset.

Skills

python, c, cuda, machine learning

Company: Cerebras Systems

Department: Engineering

Location: Sunnyvale, United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 21, 2026