hybrid

AI Inference Engineer

As an Applied AI Inference Engineer, you will be responsible for architecting, building, and deploying high-scale production AI inference systems for Evergrid's customers. This hands-on role involves owning customer projects from initial exploration to production deployment, optimizing AI/ML inference pipelines, and collaborating with both internal and customer engineering teams.

About the role

About Evergrid

Evergrid builds the infrastructure that powers advanced artificial intelligence at scale, with the Most Advanced Neocloud. We design and operate critical GPU and power infrastructure for frontier AI workloads — environments where performance, reliability, and execution are non-negotiable. Our customers are building systems at the edge of what’s possible, and they depend on infrastructure that scales quickly and works under sustained pressure. We care deeply about outcomes, ownership, and building durable systems and long-term partnerships.

The Role

As an Applied AI Inference Engineer, you will work directly with Evergrid’s customers to architect, build, and deploy high-scale production AI inference systems on our infrastructure. You will own customer projects end to end — from early exploration through production deployment and monitoring — translating ambiguous goals into observable, reliable services with clear quality, latency, and cost outcomes. This is a hands-on engineering role that blends software development, inference performance engineering, and customer-facing execution. You’ll work closely with internal platform and infrastructure teams while embedding with customer engineering organizations.

What You’ll Do

Design, build, and maintain production-grade software systems and inference services, with a strong emphasis on Python
Own customer engagements end-to-end: problem framing, evaluation, proof-of-concept, production deployment, and monitoring
Collaborate directly with customer engineering teams across sales, implementation, and expansion phases
Turn ambiguous objectives into clear technical specifications and well-scoped PoCs
Optimize AI/ML inference pipelines for latency, throughput, reliability, and cost
Contribute improvements to Evergrid's inference stack, tooling, and platform capabilities
Operate with high ownership — acting as engineer, technical lead, and execution driver for customer-facing initiatives
Navigate ambiguity and make sound tradeoffs to deliver simple, maintainable solutions

What We’re Looking For

Bachelor’s, Master’s, or PhD in Computer Science, Engineering, Mathematics, or a related field (or equivalent practical experience)
1+ years of professional experience in a fast-paced engineering environment
Experience with SGLang, vLLM, and other inference engines and schedulers
Strong experience writing production-level software, with a preference for Python
Familiarity with AI/ML pipelines and the lifecycle of model development, deployment, and monitoring
Strong communication skills, particularly when discussing complex technical topics
Experience building, deploying, or optimizing AI/ML systems is highly valued
Comfort working directly with customers and owning outcomes end-to-end
