Senior Staff Engineer, AI Platform and Infrastructure
OKX is seeking a Senior Staff Engineer to lead the design and development of large-scale AI infrastructure for mission-critical machine learning and generative AI workloads. This role involves setting technical direction, building platforms for AI model development and deployment, and driving performance and reliability across compute, data, and ML systems.
At OKX, we believe that the future will be reshaped by crypto, which will ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves. Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er. OKX is part of OKG, a group that brings the value of blockchain to users around the world through our leading products: OKX, OKX Wallet, OKLink, and more.
The AI Engineering team is responsible for integrating AI models into OKX's business lines, working with teams such as Compliance, Trading, Financial Products, and Business Intelligence.
We are looking for a Senior Staff Engineer to lead the design, development, and evolution of large-scale AI infrastructure that powers mission-critical machine learning and generative AI workloads. In this role, you will operate at the intersection of systems engineering, distributed computing, and applied AI, setting technical direction and building platforms that enable teams across the company to develop, train, deploy, and operate AI models reliably at scale.
You will be a hands-on technical leader, shaping long-term platform strategy while also diving deep into architecture, performance, and reliability challenges across compute, data, and ML systems.
Posted June 2, 2026