onsite
Software Inference Deployment Engineer - LUMAI
Software Engineer
Lead the deployment of high‑performance AI inference on a novel 3D optical accelerator, integrating Python, C++, and CUDA pipelines with containerized, cloud‑native infrastructure.
About the role
Key Responsibilities
- Design, implement, and optimize inference pipelines for the 3D optical accelerator using Python, C++, and CUDA.
- Collaborate with hardware and software teams to translate algorithmic models into efficient, low‑latency deployment code.
- Build and maintain CI/CD pipelines with Docker and Kubernetes to ensure rapid, reliable releases.
- Profile inference workloads and troubleshoot performance bottlenecks using profiling tools.
- Document best practices, create technical guides, and provide training to internal stakeholders.
Requirements
- 5+ years of software engineering experience in high‑performance computing or AI inference.
- Proficiency in Python, C++, and CUDA programming.
- Hands‑on experience with containerization (Docker) and orchestration (Kubernetes).
- Strong understanding of GPU/accelerator architectures and memory management.
- Excellent problem‑solving skills and a passion for cutting‑edge technology.
Skills
Python, C++, CUDA, Docker, Kubernetes