Abstract
We discuss the problem of choosing the complexity of a decision tree (measured in the number of leaf nodes) that yields the highest generalization performance. We first present an analysis of the generalization error of decision trees that offers a new perspective on the regularization parameter inherent to any regularization (e.g., pruning) algorithm. There is an optimal setting of this parameter for every learning problem; a setting that does well for one problem will inevitably do poorly for others. We will see that the optimal setting can in fact be estimated from the sample, without “trying out” various settings on holdout data. This leads us to a nonparametric decision tree regularization algorithm that can, in principle, work well for all learning problems.
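The abstract's key claim is that the right degree of pruning can be estimated from the training sample itself, with no holdout data. The paper's own nonparametric algorithm is not reproduced here; as a rough sketch of the general flavor, the following Python fragment implements classic pessimistic (C4.5-style) bottom-up pruning, in which a sample-based upper confidence bound on a leaf's error decides whether a subtree pays for its extra complexity. All names (Node, prune, pessimistic_error) and the confidence level are illustrative assumptions, not taken from the paper.

    import math

    def pessimistic_error(errors, n, z=0.69):
        # Upper confidence bound on a leaf's true error rate, computed
        # from the training sample alone (normal approximation to the
        # binomial; z = 0.69 roughly matches C4.5's 25% confidence level).
        if n == 0:
            return 0.5
        f = errors / n
        return (f + z * z / (2 * n)
                + z * math.sqrt(f * (1 - f) / n + z * z / (4 * n * n))) \
               / (1 + z * z / n)

    class Node:
        # errors: misclassified training examples if this node were a leaf;
        # n: training examples reaching the node; children: subtrees.
        def __init__(self, errors, n, children=None):
            self.errors = errors
            self.n = n
            self.children = children or []

    def prune(node):
        # Bottom-up pruning: collapse a subtree into a leaf whenever the
        # sample-based error bound of the leaf is no worse than the sum
        # of its children's bounds. Returns the estimated error count.
        if not node.children:
            return node.n * pessimistic_error(node.errors, node.n)
        subtree = sum(prune(c) for c in node.children)
        leaf = node.n * pessimistic_error(node.errors, node.n)
        if leaf <= subtree:
            node.children = []   # the subtree's complexity does not pay off
            return leaf
        return subtree

    # Tiny worked example: a root with two leaves of 20 examples each,
    # making 3 and 4 training errors; the root as a leaf would make 7 on 40.
    root = Node(errors=7, n=40, children=[Node(3, 20), Node(4, 20)])
    prune(root)
    print("pruned to a leaf" if not root.children else "split kept")

In this toy case the split leaves the raw training error unchanged (7 errors either way), but the bound charges the two small leaves for their extra variance, so the simpler tree is preferred; no holdout set is consulted. This is only one known scheme for setting tree complexity from the sample, not the paper's method.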
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scheffer, T. (2000). Nonparametric Regularization of Decision Trees. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_36
DOI: https://doi.org/10.1007/3-540-45164-1_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
