decision trees unified via bregman divergences

source: arxiv statistics ml: a general framework for decision trees via bregman divergences

level: research

decision trees are a staple in machine learning because they are easy to understand and can capture complex patterns. the classic cart algorithm, from 1984, uses squared error for regression and gini impurity or entropy for classification. this paper proposes a general framework that replaces these fixed loss functions with any bregman divergence, a broad family that includes squared euclidean distance, kullback-leibler divergence, and many others tied to exponential family distributions.

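bregman divergences come from convex optimization and have a natural geometric interpretation: each one is generated by a strictly convex function and measures the gap between that function's value at one point and its tangent (first-order) approximation taken at another. by using them as the splitting criterion and leaf prediction rule, the new method unifies many existing tree algorithms under one umbrella. for example, the generalized kullback-leibler divergence gives a poisson loss for count data, and itakura-saito divergence suits audio or spectral data. the framework also preserves the interpretability and computational efficiency of standard decision trees.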
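to make the family concrete, here is a minimal sketch of scalar bregman divergences built from a convex generator. it is an illustration under assumptions, not the paper's code: the function names (`bregman`, `squared`, `poisson`, `itakura_saito`) are hypothetical, and the poisson and itakura-saito versions assume strictly positive inputs.

```python
import math

def bregman(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y) for scalar x, y."""
    return phi(x) - phi(y) - grad_phi(y) * (x - y)

def squared(x, y):
    # generator phi(t) = t^2 recovers squared error (x - y)^2
    return bregman(lambda t: t * t, lambda t: 2.0 * t, x, y)

def poisson(x, y):
    # generator phi(t) = t log t - t recovers the generalized KL divergence
    # x log(x / y) - x + y; requires x > 0 and y > 0
    return bregman(lambda t: t * math.log(t) - t, lambda t: math.log(t), x, y)

def itakura_saito(x, y):
    # generator phi(t) = -log t recovers x / y - log(x / y) - 1;
    # requires x > 0 and y > 0
    return bregman(lambda t: -math.log(t), lambda t: -1.0 / t, x, y)
```

each divergence is zero when both arguments are equal and positive otherwise, but none of them needs to be symmetric: `poisson(2.0, 1.0)` and `poisson(1.0, 2.0)` differ.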

the authors show that, whichever divergence is chosen, the optimal leaf prediction is the conditional expectation of the sufficient statistic. this connects decision trees to generalized linear models, where the same exponential family link appears, and lets the tree adapt to a different data distribution by swapping the divergence rather than redesigning the algorithm. experiments demonstrate that the bregman tree matches or outperforms specialized methods on tasks like count regression and survival analysis, while keeping the model simple and transparent.
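the leaf rule and the splitting criterion can be sketched as follows. this is a simplified illustration, not the authors' algorithm: it assumes a single numeric feature, takes the sufficient statistic to be the response itself (so the leaf prediction is the plain mean), uses the poisson-type divergence as the example, and the names `poisson_div`, `leaf_loss`, and `best_split` are hypothetical.

```python
import math

def poisson_div(y, mu):
    # generalized KL divergence between observation y and prediction mu;
    # assumes y > 0 and mu > 0
    return y * math.log(y / mu) - y + mu

def leaf_loss(ys, div):
    # the mean minimizes the total bregman divergence to a constant prediction
    mu = sum(ys) / len(ys)
    return sum(div(y, mu) for y in ys)

def best_split(xs, ys, div):
    """Exhaustive search over thresholds on one feature.

    Returns (threshold, total_loss); threshold is None if no split
    improves on keeping all points in a single leaf.
    """
    pairs = sorted(zip(xs, ys))
    best = (None, leaf_loss(ys, div))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        loss = leaf_loss(left, div) + leaf_loss(right, div)
        if loss < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, loss)
    return best

# toy count data: two clearly separated groups
threshold, loss = best_split([1, 2, 3, 4], [1, 1, 5, 5], poisson_div)
```

passing a different divergence function as `div` changes the loss while the search procedure stays the same, which is the sense in which one tree algorithm covers several losses. this sketch recomputes each candidate split from scratch for clarity; it makes no attempt to match the efficiency of a production tree learner.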

why it matters: this framework lets data scientists build decision trees for many data types by choosing a suitable bregman divergence, making interpretable models more flexible and accurate across diverse applications.
