level: research
within-dataset class-split evaluation is a common way to test anomaly detection models: one class is held out as the anomaly, and the remaining classes form the normal set. the setup is meant to approximate real-world out-of-distribution detection. it breaks down, however, when the held-out class overlaps with the normal classes in feature space. in that case, anomaly scores can become uninformative or even flip direction, so that the held-out class looks more normal than the normal data, and the test result is no longer meaningful.
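the class-split protocol described above can be sketched in a few lines of NumPy. this is a minimal illustration, not the authors' code: the function names `class_split` and `all_class_splits` are hypothetical, and the sketch assumes the dataset is already loaded as a feature array `X` with integer labels `y`.

```python
import numpy as np

def class_split(X, y, held_out):
    """Split a labeled dataset into a normal set and a held-out anomaly set."""
    X = np.asarray(X)
    y = np.asarray(y)
    is_anomaly = (y == held_out)
    # Every class except `held_out` is treated as normal.
    return X[~is_anomaly], X[is_anomaly]

def all_class_splits(X, y):
    """Yield (held_out_class, X_normal, X_anomaly), holding out each class in turn."""
    for c in np.unique(y):
        X_normal, X_anomaly = class_split(X, y, c)
        yield c, X_normal, X_anomaly
```

iterating over `all_class_splits` gives one evaluation per class, which is the situation in which results for different held-out classes can be compared against each other.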
the problem is called score-direction instability. it arises because the model cannot reliably separate the held-out class from the normal data when their representations mix. the preferred score direction, meaning whether higher or lower scores should indicate an anomaly, can change depending on which class is held out. as a result, a model can appear to work well on one split and fail on another within the same dataset. the instability is a flaw in the evaluation protocol, not in the model.
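one way to make the score direction concrete is to compute a ranking metric per split and see which side of chance it falls on. the sketch below assumes AUROC as that metric, which the summary does not specify; `auroc` and `score_direction` are hypothetical helper names, and the scores are assumed to follow the usual "higher means more anomalous" convention.

```python
import numpy as np

def auroc(scores_normal, scores_anomaly):
    """Probability that a random anomaly outscores a random normal sample (ties count half)."""
    s_n = np.asarray(scores_normal, dtype=float)
    s_a = np.asarray(scores_anomaly, dtype=float)
    # Pairwise differences: one row per anomaly, one column per normal sample.
    diff = s_a[:, None] - s_n[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def score_direction(scores_normal, scores_anomaly):
    """Report which score direction separates the held-out class on this split."""
    a = auroc(scores_normal, scores_anomaly)
    if a > 0.5:
        return "higher-is-anomalous", a
    if a < 0.5:
        return "lower-is-anomalous", a
    return "uninformative", a
```

if the same model yields "higher-is-anomalous" for some held-out classes and "lower-is-anomalous" for others, that is the kind of split-dependent flip the paragraph above describes.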
the authors propose a simple, training-free diagnostic called neighborhood class leakage. it measures how much the held-out class leaks into the nearest neighbors of the normal data. high leakage predicts score-direction instability. experiments on Fashion-MNIST, CIFAR-10, and Imagenette show that the diagnostic works in both pixel space and VAE latent space. the findings suggest that class-split benchmarks should be read as geometry-dependent stress tests, not as proof of general anomaly detection ability.
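the summary does not give the exact formula for neighborhood class leakage, so the following is only one plausible reading: for each normal point, look at its k nearest neighbors in the pooled normal and held-out data, and average the fraction of those neighbors that belong to the held-out class. the function name `neighborhood_leakage`, the default `k=5`, and the use of Euclidean distance are all assumptions, and the authors' definition may differ.

```python
import numpy as np

def neighborhood_leakage(X_normal, X_held_out, k=5):
    """Mean fraction of held-out points among the k nearest neighbors of each normal point.

    Assumed definition for illustration. Brute-force distances, so use small samples;
    k must be smaller than the total number of pooled points.
    """
    X_n = np.asarray(X_normal, dtype=float).reshape(len(X_normal), -1)
    X_h = np.asarray(X_held_out, dtype=float).reshape(len(X_held_out), -1)
    pool = np.vstack([X_n, X_h])
    is_held_out = np.arange(len(pool)) >= len(X_n)

    # Squared Euclidean distance from every normal point to every pooled point.
    d2 = ((X_n[:, None, :] - pool[None, :, :]) ** 2).sum(axis=-1)
    # A point is not its own neighbor.
    idx = np.arange(len(X_n))
    d2[idx, idx] = np.inf

    nearest = np.argsort(d2, axis=1)[:, :k]
    return float(is_held_out[nearest].mean())
```

under this reading, a value near 0 means the held-out class sits apart from the normal data, while larger values mean the two are interleaved. because it needs only distances, the same function can be applied to raw pixels or to latent codes from a pretrained encoder.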
why it matters: practitioners relying on class-split benchmarks may overestimate their anomaly detection models, leading to failures in real applications where anomalies overlap with normal data.