Where Reliability Lives in Vision-Language Models
A mechanistic study finds that attention sharpness is a near-zero predictor of correctness in vision-language models, while hidden-state geometry and self-consistency offer stronger reliability signals.
AI