Abstract
This paper focuses on the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Different discretization procedures are first studied empirically; then, means to reduce the discretization variance are proposed. The experiments show that discretization variance is large and that it is possible to reduce it significantly without notable computational cost. The resulting variance reduction mainly improves the interpretability and stability of decision trees, and marginally their accuracy.
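For background (this is a generic illustration, not the paper's specific procedure), local discretization in decision tree induction typically selects, for each continuous attribute, the cut-point that maximizes an information-theoretic score such as information gain. The sketch below assumes a binary split on mid-points between consecutive sorted values; the function names are illustrative:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain): the mid-point cut maximizing information gain.

    Candidate thresholds are mid-points between consecutive distinct
    sorted attribute values, as in classical top-down tree induction.
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best_gain, best_t = -1.0, None
    n = len(ys)
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no valid cut between equal attribute values
        t = (xs[i] + xs[i - 1]) / 2.0
        left, right = ys[:i], ys[i:]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain
```

Because the chosen threshold depends on the exact sample drawn, resampling the learning set shifts the cut-point from run to run; this sensitivity is the discretization variance the paper studies and proposes to reduce.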
Keywords
- Discretization Variance
- Continuous Attribute
- Threshold Variance
- Discretization Technique
- Decision Tree Induction
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Geurts, P., Wehenkel, L. (2000). Investigation and Reduction of Discretization Variance in Decision Tree Induction. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science(), vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_17
DOI: https://doi.org/10.1007/3-540-45164-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
eBook Packages: Springer Book Archive