source: arxiv statistics ml: surprises in proper positive-only learning

level: research

binary classification from positive-only samples is a learning setup where the algorithm sees only examples from the positive class during training, but is evaluated on the full distribution, negatives included. this model was introduced by Natarajan in 1987. the improper case, where the learner may output any hypothesis, has been understood for decades and appears in textbooks. the proper case, where the output hypothesis must itself belong to the concept class, remained open until now.
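as an illustration of the setup only (this example is not taken from the paper), a classic proper positive-only learner for intervals on the real line outputs the smallest interval containing every positive sample. the sketch below assumes a uniform test distribution on [0, 1] and a hypothetical target interval; all function names are made up for the example.

```python
import random


def sample_positives(target, n, rng):
    """Draw n training points from the positive region only (assumed uniform)."""
    lo, hi = target
    return [rng.uniform(lo, hi) for _ in range(n)]


def learn_interval(positives):
    """Proper learner: return the smallest closed interval covering all positives."""
    if not positives:
        raise ValueError("need at least one positive example")
    return min(positives), max(positives)


def predict(interval, x):
    """Label x positive exactly when it lies inside the interval."""
    lo, hi = interval
    return lo <= x <= hi


def estimate_error(hypothesis, target, n_test, rng):
    """Monte Carlo error on the full distribution (assumed uniform on [0, 1])."""
    mistakes = 0
    for _ in range(n_test):
        x = rng.random()
        if predict(hypothesis, x) != predict(target, x):
            mistakes += 1
    return mistakes / n_test


if __name__ == "__main__":
    rng = random.Random(0)
    target = (0.3, 0.7)  # hypothetical ground-truth concept
    hypothesis = learn_interval(sample_positives(target, 200, rng))
    print("learned interval:", hypothesis)
    print("estimated error:", estimate_error(hypothesis, target, 10_000, rng))
```

the learned interval always sits inside the target, so it never labels a true negative as positive; its only mistakes are positives near the two endpoints, and that region shrinks as more positive samples arrive. the output is itself an interval, which is what makes this learner proper.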

this work shows that a concept class is properly learnable from positive-only samples if and only if it has finite VC dimension and satisfies a new combinatorial property called uniform exterior separability. the condition essentially requires that for any set of points, there is a concept in the class that separates them from the rest of the space in a consistent way. the proof combines learning-theoretic and combinatorial arguments.

the result reveals a sharp difference from standard PAC learning. in the usual setting, proper and improper learnability coincide for binary classification: ignoring computational cost, any class of finite VC dimension can be learned properly by empirical risk minimization. here, proper learning is strictly harder. the authors give separating examples: classes that are improperly learnable from positive-only samples but not properly learnable. positive-only learning therefore has a richer structure than previously thought, with practical implications for scenarios where only positive data is available.

why it matters: this clarifies when models can be trained reliably using only positive examples, which is common in applications like anomaly detection or medical diagnosis where negative samples are scarce or undefined.

