The Cohire for AI engineers — and the hiring partner for the teams building frontier intelligence.
© 2026 Gravity Engineering Services Pvt. Ltd. All rights reserved. hello@opentalent.in
Senior GenAI Platform Engineer / Senior LLM Infrastructure Engineer (On-Prem AI Platform)
AsceticVoyage is looking for a Senior GenAI Platform Engineer / Senior LLM Infrastructure Engineer to work on on-premises AI platforms. The role requires extensive experience with both cloud and on-premises AI/ML infrastructure, including GPU orchestration, inference optimization, and MLOps/LLMOps.
About the Role
AsceticVoyage is seeking a Senior GenAI Platform Engineer / Senior LLM Infrastructure Engineer with expertise in on-prem AI platforms to join our team in Charlotte, NC.
Requirements

Cloud Requirements
Arize AI, Claude Cowork, GCP, Terraform, Azure, Cloud Networking, Landing Zones, Org Policy / Governance, HashiCorp Vault, Hybrid Connectivity, Kubernetes, GKE, OpenShift (OCP), Platform Engineering, Observability, SRE / SLOs, Python, Internal Developer Portals, GenAI Platforms, LLMs, RAG, MLOps/LLMOps

On-premises Requirements
vLLM, TensorRT-LLM, Triton Inference Server, SGLang, Inference Optimization, Continuous Batching, Speculative Decoding, KV Cache / Prefix Caching, FP8 / AWQ / GPTQ, Tensor Parallelism, Kubernetes ML Serving, KServe, OpenShift AI, Helm / Operators, GPU Orchestration, Run:AI, Performance Benchmarking, CUDA / NCCL / MIG, Prometheus / Grafana, ML Observability, GuideLLM, Locust, Arize AI, Claude Cowork
Company: AsceticVoyage
Department: Engineering
Location: Charlotte, United States
Experience: 5+ years
Tenure: Contractual
Level: Senior
Posted: June 7, 2026