onsite
Staff Software Engineer, Inference Platform
Software Engineer
About the role
Lead the design and maintenance of a high‑availability inference platform, with a focus on C++ performance, CI/CD pipelines, and robust alerting across active/active deployments for mission‑critical services.
Key Responsibilities
- Architect and implement scalable, low‑latency inference services in C++ for real‑time analytics.
- Design and maintain end‑to‑end CI/CD pipelines, ensuring rapid, reliable deployments across active/active clusters.
- Develop and refine alerting for active/active clusters so that failures are detected, diagnosed, and recovered from with minimal downtime.
- Collaborate with cross‑functional teams to translate business requirements into technical solutions and performance benchmarks.
- Mentor junior engineers, conduct code reviews, and promote best practices in software quality and maintainability.
Requirements
- 10+ years of software engineering experience, including 5+ years working with C++ and distributed systems.
- Proven expertise in CI/CD tooling (Jenkins, GitLab CI, Argo CD) and container orchestration (Kubernetes).
- Deep understanding of active/active architectures, high‑availability patterns, and fault tolerance.
- Strong debugging skills, including profiling, memory analysis, and performance tuning.
- Excellent communication, leadership, and problem‑solving abilities.