Summary
Decision tree models typically give good classification decisions but poor probability estimates. In many applications, it is important to have good probability estimates as well. This chapter introduces a new algorithm, Bagged Lazy Option Trees (B-LOTs), for constructing decision trees and compares it to an alternative, Bagged Probability Estimation Trees (B-PETs). The quality of the class probability estimates produced by the two methods is evaluated in two ways. First, we compare the ability of the two methods to make good classification decisions when the misclassification costs are asymmetric. Second, we compare the absolute accuracy of the estimates themselves. The experiments show that B-LOTs produce better decisions and more accurate probability estimates than B-PETs.
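To make the setting concrete, the sketch below illustrates the general recipe the chapter evaluates, not the authors' B-LOT algorithm itself: bag unpruned decision trees, average Laplace-corrected leaf frequencies to obtain class probability estimates (the B-PET recipe), and then choose the class that minimizes expected cost under an asymmetric cost matrix. The dataset, the ensemble size B, and the 10:1 cost matrix are illustrative assumptions, and scikit-learn trees stand in for the original implementations.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # assumed example dataset
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)

def laplace_proba(tree, X, n_classes):
    # Laplace-corrected leaf estimate, (n_c + 1) / (n + K): recover the raw
    # class counts at each example's leaf and smooth them, as B-PETs do.
    leaf = tree.apply(X)                          # leaf index per example
    n = tree.tree_.n_node_samples[leaf].astype(float)
    counts = tree.predict_proba(X) * n[:, None]   # raw class counts at leaf
    return (counts + 1.0) / (n[:, None] + n_classes)

# Bagging: fit B unpruned trees on bootstrap replicates of the training set.
B = 30  # assumed ensemble size
trees = []
for _ in range(B):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))  # sample with replacement
    trees.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

# Ensemble probability estimate: average the per-tree smoothed estimates.
proba = np.mean([laplace_proba(t, X_te, 2) for t in trees], axis=0)

# Hypothetical asymmetric costs: cost[i, j] is the cost of predicting class j
# when the true class is i; here missing class 1 costs ten times a false alarm.
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])

# Cost-sensitive decision rule: argmin_j sum_i P(i | x) * cost[i, j],
# i.e. pick the class with minimum expected cost given the estimates.
decisions = np.argmin(proba @ cost, axis=1)
print("mean cost incurred:", cost[y_te, decisions].mean())

Under such a rule the decision threshold moves away from 0.5, so the absolute accuracy of the probability estimates, and not just their ranking, determines the cost incurred; this is why the chapter evaluates both decision quality and estimate accuracy.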
Copyright information
© 2003 Springer Science+Business Media New York
Cite this chapter
Margineantu, D.D., Dietterich, T.G. (2003). Improved Class Probability Estimates from Decision Tree Models. In: Denison, D.D., Hansen, M.H., Holmes, C.C., Mallick, B., Yu, B. (eds) Nonlinear Estimation and Classification. Lecture Notes in Statistics, vol 171. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21579-2_10
DOI: https://doi.org/10.1007/978-0-387-21579-2_10
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-95471-4
Online ISBN: 978-0-387-21579-2