SLIT: Designing Complexity Penalty for Classification and Regression Trees Using the SRM Principle
Statistical learning theory formulates the Structural Risk Minimization (SRM) principle, which is based on the functional form of the risk bound on the generalization performance of a learning machine. This paper applies that bound, which acts as a complexity penalty, to model selection for decision trees; the capacity of a decision tree is quantified using an empirical approach. Experimental results show that, for both classification and regression problems, this new pruning strategy performs better than alternative methods. We call classification and regression trees pruned with this methodology Statistical Learning Intelligent Trees (SLIT).
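To illustrate how a risk bound can serve as a complexity penalty for choosing among pruned subtrees, here is a minimal sketch. It assumes the classical Vapnik bound for classification, R <= R_emp + sqrt((h (ln(2n/h) + 1) - ln(eta/4)) / n); the abstract does not state the exact form or constants used in the paper, so this may differ from the authors' formula. The function names and the capacity values `h` are hypothetical placeholders; in the paper the capacity is estimated empirically.

```python
import math


def vc_confidence(h, n, eta=0.05):
    """Confidence term of the classical Vapnik bound (an assumed form).

    h   : capacity (VC-dimension) of the candidate tree; here a
          hypothetical input standing in for an empirical estimate
    n   : number of training samples
    eta : the bound holds with probability at least 1 - eta
    """
    if not 0 < h <= n:
        raise ValueError("expected 0 < h <= n")
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


def srm_bound(emp_risk, h, n, eta=0.05):
    """Empirical risk plus the capacity-dependent penalty."""
    return emp_risk + vc_confidence(h, n, eta)


def select_subtree(candidates, n, eta=0.05):
    """Return the (name, emp_risk, h) tuple with the smallest bound."""
    return min(candidates, key=lambda c: srm_bound(c[1], c[2], n, eta))


# Hypothetical pruning sequence: larger trees fit the training data
# better (lower empirical risk) but have higher capacity.
candidates = [
    ("full tree", 0.02, 200),
    ("pruned tree", 0.05, 40),
    ("stump", 0.40, 2),
]
best = select_subtree(candidates, n=1000)
print(best[0])
```

With these made-up numbers the penalty outweighs the full tree's low training error, and the stump's training error is too high, so the intermediate subtree minimizes the bound. The trade-off shifts as `n` grows, because the penalty shrinks for a fixed capacity.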
Keywords: Regression Tree, Empirical Risk, Statistical Learning Theory, Tree Pruning, Pruning Strategy