onsite
AI/LLM Network Software Development Engineer
Research Engineer
Lead the design and implementation of high‑performance network services for AI/LLM workloads, leveraging C++, Docker, gRPC, Kubernetes, and Linux to deliver scalable, resilient infrastructure.
About the role
Key Responsibilities
- Architect and develop distributed network services in C++ to support AI/LLM inference pipelines.
- Containerize applications using Docker and orchestrate deployments with Kubernetes for high availability.
- Implement efficient gRPC interfaces for low‑latency communication between microservices.
- Optimize Linux kernel and system parameters to meet stringent performance and reliability targets.
- Collaborate with cross‑functional teams to integrate new features and troubleshoot production issues.
Requirements
- 5+ years of professional C++ development experience.
- Proven expertise in Docker, Kubernetes, and Linux system administration.
- Strong understanding of gRPC and network protocol design.
- Experience with performance profiling, debugging, and tuning in distributed environments.
- Excellent problem‑solving skills and a passion for building scalable AI infrastructure.
Skills
C++, Docker, gRPC, Kubernetes, Linux