onsite
Machine Learning - Tetramem
ML Engineer
Internship focused on developing and optimizing Python and C++ tools that compress, convert, and deploy neural network models on analog compute‑in‑memory hardware.
About the role
Key Responsibilities
- Develop and maintain Python and C++ software for neural network model compression, conversion, and deployment.
- Optimize the runtime performance of deep learning workloads on analog compute‑in‑memory chips.
- Collaborate with hardware engineers to integrate software pipelines with novel analog hardware architectures.
- Debug, profile, and benchmark models to meet latency and accuracy targets.
- Document code, create user guides, and contribute to internal knowledge bases.
Requirements
- Strong programming skills in Python and C++.
- Familiarity with neural network frameworks (e.g., PyTorch, TensorFlow) and model compression techniques.
- Basic understanding of hardware‑software co‑design and analog computing concepts.
- Excellent problem‑solving abilities and attention to detail.
- Enthusiasm for working at the intersection of software, hardware, and AI.