source: arxiv machine learning: the attribution impossibility: no feature ranking is faithful, stable, and complete under collinearity

level: research

when features are collinear, no feature ranking can be faithful, stable, and complete at the same time. for pairs of correlated features, ranking becomes a coin flip. this impossibility is proven mathematically and verified with 305 theorems in the lean 4 proof assistant. the work maps out the entire design space for attribution methods, showing only two families exist: faithful-complete methods that are unstable, with rankings flipping up to half the time, and ensemble methods that are stable and report ties for symmetric features. no method falls outside this split.
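the coin-flip behaviour can be reproduced in a toy simulation. the sketch below is not the paper's experiment: it assumes two features with correlation rho that contribute equally to the target, and it uses the sample covariance with the target as a stand-in importance score. the function name `top_feature` and all parameter values are hypothetical.

```python
import random

def top_feature(seed, rho=0.9, n=200):
    """Return the index (0 or 1) of the feature with the larger importance
    score in one simulated fit. The score is the sample covariance with the
    target, used here only as a simple stand-in for an attribution method."""
    rng = random.Random(seed)
    score = [0.0, 0.0]
    for _ in range(n):
        z = rng.gauss(0, 1)
        # x1 and x2 share the component z, so corr(x1, x2) = rho
        x1 = rho ** 0.5 * z + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        x2 = rho ** 0.5 * z + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        # the target depends on both features symmetrically
        y = x1 + x2 + rng.gauss(0, 1)
        score[0] += x1 * y / n
        score[1] += x2 * y / n
    return 0 if score[0] > score[1] else 1

# repeat the fit on 400 independent samples and count how often feature 0 wins
n_fits = 400
share_first = sum(top_feature(seed) == 0 for seed in range(n_fits)) / n_fits
print(f"feature 0 ranked first in {share_first:.0%} of fits")
```

because the two features are interchangeable by construction, neither has a real claim to the top rank, and the share printed above should land near one half.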

the impossibility is quantitative. for gradient boosting, the attribution ratio diverges as 1/(1-rho^2) as the feature correlation rho approaches 1. for lasso, the ratio is infinite. for random forests, it converges. the paper introduces dash, which stands for diversified aggregation of shap. dash is provably pareto-optimal among unbiased aggregations. it reaches the cramér-rao variance bound and comes with a formula for the needed ensemble size. this gives a practical way to get stable rankings when features are correlated.
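to get a feel for the 1/(1-rho^2) rate quoted for gradient boosting, the snippet below tabulates it for a few correlation values. it only evaluates the growth rate stated above. the exact definition of the attribution ratio and any constant factor are not given in this summary, and the function name `divergence_rate` is hypothetical.

```python
def divergence_rate(rho):
    """Evaluate 1 / (1 - rho^2), the growth rate quoted for gradient boosting."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 / (1.0 - rho ** 2)

# the rate stays modest for moderate correlation and blows up as rho nears 1
for rho in (0.0, 0.5, 0.9, 0.99, 0.999):
    print(f"rho = {rho:<5}  1/(1-rho^2) = {divergence_rate(rho):.1f}")
```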

the results apply to common model classes and show that standard feature importance methods can be unreliable when predictors are not independent. the lean 4 formalization provides machine-checked proofs of the core claims. the dichotomy between unstable faithful-complete methods and stable ensemble methods gives clear guidance for practitioners choosing attribution tools. the dash method offers a principled alternative on the stable side of that split: it avoids the instability by reporting ties between symmetric features instead of forcing an arbitrary order.
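the stable side of the dichotomy can be illustrated with a generic averaging scheme. the sketch below is not the paper's dash algorithm, whose details are not given in this summary. it assumes the same toy setup of two symmetric features with correlation rho, averages a stand-in importance score over an ensemble of independent fits, and reports a tie when the mean difference is within a few standard errors of zero. the names `importances` and `ensemble_rank` and the threshold `z_crit` are hypothetical.

```python
import random
import statistics

def importances(seed, rho=0.9, n=200):
    """Stand-in importance scores (sample covariance with the target)
    for two symmetric features with correlation rho, from one simulated fit."""
    rng = random.Random(seed)
    score = [0.0, 0.0]
    for _ in range(n):
        z = rng.gauss(0, 1)
        x1 = rho ** 0.5 * z + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        x2 = rho ** 0.5 * z + (1 - rho) ** 0.5 * rng.gauss(0, 1)
        y = x1 + x2 + rng.gauss(0, 1)
        score[0] += x1 * y / n
        score[1] += x2 * y / n
    return score

def ensemble_rank(seeds, z_crit=3.0):
    """Average the score difference over an ensemble of fits and report a tie
    unless the mean difference exceeds z_crit standard errors."""
    diffs = [a - b for a, b in (importances(s) for s in seeds)]
    mean_diff = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / len(diffs) ** 0.5
    if abs(mean_diff) < z_crit * std_err:
        return "tie"
    return "x1 > x2" if mean_diff > 0 else "x2 > x1"

# 20 disjoint ensembles of 50 fits each; symmetric features should mostly tie
verdicts = [ensemble_rank(range(k * 50, (k + 1) * 50)) for k in range(20)]
tie_count = verdicts.count("tie")
print(f"{tie_count} of {len(verdicts)} ensembles report a tie")
```

reporting a tie gives up completeness for the symmetric pair, which is the trade the impossibility result says a stable method has to make.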

why it matters: data scientists using feature importance on correlated data can get misleading results; this work explains the limits and offers a stable alternative.

