onsite

Machine Learning Engineer, Integrity

HackerRank is seeking a Machine Learning Engineer specializing in Integrity to enhance its assessment fraud detection systems. This role involves defining and measuring model quality, improving signal performance, and developing new ML strategies for signals like audio analysis and gaze tracking in an adversarial environment.

About the role

Hiring is one of the most consequential decisions a company makes. 3,000+ enterprises rely on HackerRank to get it right. We are now reinventing how that works for the agentic era. The integrity of that system is not a feature. It is the foundation.

Open Problem

The fraud gets smarter every quarter. The models have to keep up.

Integrity isn't about whether you use AI or not; it's about whether you are following the rules. Our integrity system is a portfolio of signals spanning vision, code analysis, browser telemetry, and behavioral sequences. Each signal is independently trained, operates on a different modality, and has its own precision-recall tradeoff and failure distribution. That heterogeneity is not a design choice we made lightly. It reflects the fact that no single modality is sufficient, and that the most informative signal depends heavily on context. A model at 92% precision on today's proxy attempt patterns can sit at 74% within two quarters, not because the model degraded, but because the attack surface shifted and the training distribution did not.

The fusion problem is where current approaches hit a ceiling. Naive aggregation across signals does not work because the signals are not independent and their reliability varies with context. What is needed is calibrated uncertainty at the signal level, a principled way to weight evidence depending on conditions, and the ability to detect when a signal has drifted out of its reliable operating range. The field has good solutions for static multimodal fusion. It does not have good solutions for adversarially non-stationary multimodal fusion where the ground truth labels are expensive, delayed, and partially unobservable.
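To make the weighting idea concrete, here is a minimal sketch, not a description of HackerRank's system. It assumes each signal already emits a calibrated probability, and that a hypothetical per-signal `reliability` weight in [0, 1] is supplied for the current context. The function names, the prior of 0.01, and the example numbers are all illustrative assumptions.

```python
import math
from typing import Mapping


def logit(p: float) -> float:
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(
    signals: Mapping[str, float],
    reliability: Mapping[str, float],
    prior: float = 0.01,  # assumed base rate, purely illustrative
) -> float:
    """Combine calibrated per-signal probabilities into one probability.

    Each signal shifts the log-odds away from the prior, scaled by its
    reliability weight. Weight 0 ignores a signal (for example, one that
    has drifted); weight 1 trusts it fully. Signals with no weight given
    are ignored.
    """
    base = logit(prior)
    total = base
    for name, p in signals.items():
        w = reliability.get(name, 0.0)
        total += w * (logit(p) - base)
    return sigmoid(total)


if __name__ == "__main__":
    # Hypothetical signal names and scores.
    scores = {"vision": 0.60, "browser_telemetry": 0.30}
    weights = {"vision": 0.8, "browser_telemetry": 0.4}
    print(round(fuse(scores, weights), 4))
```

With every weight set to 1, this reduces to the naive independent-evidence aggregation that the paragraph above says breaks down. Weights below 1 are one crude way to discount correlated or unreliable signals. Learning those weights from delayed, partial labels and noticing when they go stale is the part this sketch leaves open.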

Latency adds a constraint that rules out many otherwise viable architectures. Inference runs during a live assessment, which bounds what is deployable regardless of accuracy. That means the model quality problem and the systems problem have to be solved together, not sequentially. And the system has to generalize across populations and environments it was not trained on, without encoding the kinds of demographic biases that make a technically accurate classifier practically indefensible.

The honest summary: the current generation of integrity tooling, ours and the market's, has a precision ceiling that the next wave of fraud techniques will break through. Raising that ceiling requires building past what exists, not fine-tuning it.

What You Will Do

Standardize how model quality is defined, measured, and reported across all integrity signals. Build the evaluation infrastructure, golden datasets, and benchmarking pipelines that give us and our customers genuine confidence in what we ship.
Own the performance improvement strategy for each signal. Explore newer architectures, emerging research, and different training paradigms. The approach will not be one-size-fits-all; it will be grounded in each signal's maturity, data quality, and what the science actually supports.
Define the ML strategy for new signals from scratch: audio analysis, gaze tracking, behavioral anomalies. Set the architecture, data requirements, and a clear bar for what production-ready looks like before anything ships.
Continuously monitor how assessment fraud tooling is evolving. Evaluate new models as they emerge. Know when to abandon a strategy that is no longer moving the needle.
Systematically surface edge cases, build training data around them, and turn every customer-reported failure into a model that is harder to fool.
Drive strategy-level decisions: which new signals to build, whether to use models at all, and what the evidence says.

Who You Are

You have shipped ML systems in production that real users and real businesses depend on.
You have deep intuition for where precision leaks happen and how to find them systematically, not by luck.
You think in systems. A signal's accuracy number, its data pipeline, its serving infrastructure, and its customer-facing outcome are one problem to you.
You care as much about evaluation methodology as model performance. You know that a metric measured wrong is worse than no metric.
You are genuinely curious about adversarial dynamics. The fact that your model will be attacked is interesting to you, not exhausting.

Even better if you have

Experience with multimodal systems in production: vision, audio, or behavioral signal pipelines.
Background in adversarial ML or fraud/anomaly detection.
Publications or open-source work in detection, robustness, or model evaluation.
Prior experience defining what production-ready means for a new signal category from scratch.

You will thrive here if

You find the performance improvement problem as interesting as the model problem.
You are energized by adversarial environments where the threat landscape shifts under your feet and the only valid response is to stay ahead of it.
You do not inherit clean problems and optimize within them.
You define what the problem even is, and you hold yourself to a higher bar than the one the market has set.
