© 2026 Gravity Engineering Services Pvt. Ltd. All rights reserved. Contact: hello@opentalent.in

remote

Inference Engineer - Deepinfra Inc.

Deepinfra Inc. is looking for an Inference Engineer to optimize and deploy high-performance AI models, with a focus on low-latency systems and hardware acceleration.

About the role

Key Responsibilities

Optimize and deploy ML models for high-performance inference at scale
Develop low-latency systems for real-time AI applications
Implement quantization, pruning, and other optimization techniques
Collaborate with hardware teams to maximize hardware utilization
Benchmark and profile inference performance across different platforms
Ensure reliability and efficiency of production inference pipelines

Requirements

3+ years in systems programming or ML inference optimization
Expertise in C++ and Python for performance-critical applications
Experience with GPU computing and CUDA programming
Knowledge of model optimization techniques and hardware acceleration
Strong debugging and profiling skills for performance tuning

Skills

C, Python, GPU computing, model optimization, CUDA, inference systems

Department: Engineering

Location: India

Experience: 5+ years

Tenure: Full-time

Level: Senior

Posted June 1, 2026

Inference Engineer - Deepinfra Inc. | OpenTalent