onsite

Machine Learning - Tetramem

ML Engineer

Internship focused on developing and optimizing Python and C++ tools for compressing, converting, and deploying neural network models on cutting‑edge analog compute‑in‑memory hardware.

About the role

Key Responsibilities

Develop and maintain Python and C++ software for neural network model compression, conversion, and deployment.
Optimize runtime performance on analog compute‑in‑memory chips, ensuring efficient execution of deep learning workloads.
Collaborate with hardware engineers to integrate software pipelines with novel analog hardware architectures.
Debug, profile, and benchmark models to meet latency and accuracy targets.
Document code, create user guides, and contribute to internal knowledge bases.

Requirements

Strong programming skills in Python and C++.
Familiarity with neural network frameworks (e.g., PyTorch, TensorFlow) and model compression techniques.
Basic understanding of hardware‑software co‑design and analog computing concepts.
Excellent problem‑solving abilities and attention to detail.
Enthusiasm for working at the intersection of software, hardware, and AI.

Skills

python, c

Company: Tetramem

Department: Research

Location: San Jose, United States

Experience: 3+ years

Tenure: full-time

Level: Mid-Level

Posted June 21, 2026