Machine Learning Engineer, Integrity
HackerRank is seeking a Machine Learning Engineer specializing in Integrity to strengthen its assessment fraud detection systems. The role involves defining and measuring model quality, improving signal performance, and developing new ML strategies for signals such as audio analysis and gaze tracking in an adversarial environment.
Hiring is one of the most consequential decisions a company makes. 3,000+ enterprises rely on HackerRank to get it right. We are now reinventing how that works for the agentic era. The integrity of that system is not a feature. It is the foundation.
The fraud gets smarter every quarter. The models have to keep up.
Integrity isn't about whether you use AI or not; it's about whether you are following the rules. Our integrity system is a portfolio of signals spanning vision, code analysis, browser telemetry, and behavioral sequences. Each signal is independently trained, operates on a different modality, and has its own precision-recall tradeoff and failure distribution. That heterogeneity is not a design choice we made lightly. It reflects the fact that no single modality is sufficient, and that the most informative signal depends heavily on context. A model at 92% precision on today's proxy attempt patterns can sit at 74% within two quarters, not because the model degraded, but because the attack surface shifted and the training distribution did not.
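To make the drift mechanism concrete, here is a minimal sketch of how precision can fall while the model itself stays fixed. It assumes a deliberately simplified setup that is not from our production system: fraud attempts are either a known pattern or a novel one, the model's recall on each and its false positive rate never change, and only the share of novel attempts moves. All function and parameter names are hypothetical.

```python
def precision_under_shift(prevalence, false_positive_rate,
                          recall_known, recall_novel, novel_share):
    """Precision of a fixed model as the attack mix shifts.

    Illustrative assumption: fraud splits into a known pattern and a novel
    pattern, and the model's per-pattern recall and false positive rate
    stay constant. Only novel_share (fraction of fraud that is novel) moves.
    """
    # Effective recall is the mix-weighted average of per-pattern recall.
    recall = (1.0 - novel_share) * recall_known + novel_share * recall_novel
    true_positives = prevalence * recall
    false_positives = (1.0 - prevalence) * false_positive_rate
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged > 0 else float("nan")


# Hypothetical numbers, chosen only to show the direction of the effect.
for share in (0.0, 0.25, 0.5, 0.75):
    p = precision_under_shift(
        prevalence=0.05, false_positive_rate=0.004,
        recall_known=0.874, recall_novel=0.10, novel_share=share,
    )
    print(f"novel share {share:.2f}: precision {p:.3f}")
```

Nothing about the model changes between iterations of the loop. The false positives stay constant while the true positives shrink, so precision drops purely because the attack mix moved away from what the model was trained on.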
The fusion problem is where current approaches hit a ceiling. Naive aggregation across signals does not work because the signals are not independent and their reliability varies with context. What is needed is calibrated uncertainty at the signal level, a principled way to weight evidence depending on conditions, and the ability to detect when a signal has drifted out of its reliable operating range. The field has good solutions for static multimodal fusion. It does not have good solutions for adversarially non-stationary multimodal fusion where ground-truth labels are expensive, delayed, and partially unobservable.
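As a reference point for what "weighting evidence" can mean, here is a minimal sketch of reliability-weighted log-odds fusion. It is a textbook-style baseline, not a description of our system, and it assumes each signal already emits a calibrated probability against a shared prior. The `reliability` weights and the `prior` value are hypothetical inputs; choosing them per context, and noticing when they have gone stale, is the open problem described above.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse(signals, prior=0.01):
    """Combine per-signal probabilities into one fused probability.

    signals: iterable of (probability, reliability) pairs, where
    probability is assumed calibrated and reliability is in [0, 1].
    With every reliability at 1.0 this reduces to naive independent
    fusion; lower values discount redundant or drifted signals.
    """
    prior_logit = logit(prior)
    fused_logit = prior_logit
    for probability, reliability in signals:
        # Clamp to avoid infinite log-odds from overconfident signals.
        probability = min(max(probability, 1e-6), 1.0 - 1e-6)
        # Each signal's evidence is its shift in log-odds from the prior.
        evidence = logit(probability) - prior_logit
        fused_logit += reliability * evidence
    return sigmoid(fused_logit)


# Hypothetical example: two correlated signals, the second discounted.
print(fuse([(0.60, 1.0), (0.55, 0.3)], prior=0.01))
```

Setting a signal's reliability to 0.0 removes it from the decision entirely, which is the crude version of handling a signal that has drifted out of its reliable operating range. The static weights are exactly what breaks under adversarial non-stationarity.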
Latency adds a constraint that rules out many otherwise viable architectures. Inference runs during a live assessment, which bounds what is deployable regardless of accuracy. That means the model quality problem and the systems problem have to be solved together, not sequentially. And the system has to generalize across populations and environments it was not trained on, without encoding the kinds of demographic biases that make a technically accurate classifier practically indefensible.
The honest summary: the current generation of integrity tooling, ours and the market's, has a precision ceiling that the next wave of fraud techniques will break through. Raising that ceiling requires building past what exists, not fine-tuning it.
Posted June 8, 2026