remote
Inference Engineer
We are seeking an Inference Engineer to build and optimize high-performance AI inference systems for scalable deployment.
About the role
Key Responsibilities
- Develop high-performance inference engines for AI models
- Optimize model architectures for low-latency and high-throughput inference
- Implement GPU-accelerated computing solutions
- Collaborate with ML teams to integrate optimized models into production systems
- Profile and benchmark inference performance
- Ensure compatibility across diverse hardware platforms
Requirements
- 2+ years of experience in systems programming or AI inference
- Strong proficiency in C++ and Python
- Experience with GPU computing (CUDA/OpenCL) and model optimization
- Knowledge of neural network architectures and performance tuning
- Familiarity with Linux and performance profiling tools
Skills
c, python, gpu computing, model optimization, cuda, inference systems