Improved Class Probability Estimates from Decision Tree Models

Chapter in Nonlinear Estimation and Classification

Part of the book series: Lecture Notes in Statistics (LNS, volume 171)

Summary

Decision tree models typically give good classification decisions but poor probability estimates. In many applications, it is important to have good probability estimates as well. This chapter introduces a new algorithm, Bagged Lazy Option Trees (B-LOTs), for constructing decision trees and compares it to an alternative, Bagged Probability Estimation Trees (B-PETs). The quality of the class probability estimates produced by the two methods is evaluated in two ways. First, we compare the ability of the two methods to make good classification decisions when the misclassification costs are asymmetric. Second, we compare the absolute accuracy of the estimates themselves. The experiments show that B-LOTs produce better decisions and more accurate probability estimates than B-PETs.
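
The preview gives only this summary of the two algorithms, so the following is a minimal sketch, not the chapter's implementation. It illustrates the B-PET side of the comparison: bag unpruned decision trees, apply a Laplace correction to the class frequencies at each leaf, and average the corrected estimates across the ensemble; a cost matrix then turns the averaged probabilities into decisions, mirroring the asymmetric-cost evaluation described above. The function names and parameters are illustrative, scikit-learn's DecisionTreeClassifier stands in for the tree learner, and integer class labels 0..K-1 are assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def fit_bpet(X, y, n_trees=25, seed=0):
    """Sketch of bagged probability estimation trees (B-PETs):
    unpruned trees on bootstrap samples, Laplace-smoothed leaves."""
    rng = np.random.RandomState(seed)
    n_classes = int(np.max(y)) + 1                     # assumes labels 0..K-1
    ensemble = []
    for _ in range(n_trees):
        Xb, yb = resample(X, y, random_state=rng)      # bootstrap replicate
        tree = DecisionTreeClassifier(random_state=rng)  # grown without pruning
        tree.fit(Xb, yb)
        # Laplace correction at each leaf: P(c | leaf) = (n_c + 1) / (n + K).
        leaves = tree.apply(Xb)
        leaf_probs = {}
        for leaf in np.unique(leaves):
            counts = np.bincount(yb[leaves == leaf], minlength=n_classes)
            leaf_probs[leaf] = (counts + 1.0) / (counts.sum() + n_classes)
        ensemble.append((tree, leaf_probs))
    return ensemble

def predict_proba_bpet(ensemble, X):
    # Average the smoothed leaf estimates over the bagged trees.
    per_tree = [np.array([leaf_probs[leaf] for leaf in tree.apply(X)])
                for tree, leaf_probs in ensemble]
    return np.mean(per_tree, axis=0)

def decide(probs, cost):
    # cost[i, j] = cost of predicting class j when the truth is class i;
    # choose the class with minimum expected cost under the estimates.
    return np.argmin(probs @ cost, axis=1)

# Example on a hypothetical two-class task where false negatives cost 10x.
X, y = make_classification(n_samples=400, random_state=0)
ensemble = fit_bpet(X, y)
probs = predict_proba_bpet(ensemble, X[:5])
cost = np.array([[0.0,  1.0],
                 [10.0, 0.0]])
decisions = decide(probs, cost)
```

With asymmetric costs the decision threshold shifts away from 0.5, which is why the absolute accuracy of the probability estimates, and not just their argmax, matters for the first evaluation the chapter describes.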



Copyright information

© 2003 Springer Science+Business Media New York

About this chapter

Cite this chapter

Margineantu, D.D., Dietterich, T.G. (2003). Improved Class Probability Estimates from Decision Tree Models. In: Denison, D.D., Hansen, M.H., Holmes, C.C., Mallick, B., Yu, B. (eds) Nonlinear Estimation and Classification. Lecture Notes in Statistics, vol 171. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21579-2_10

  • DOI: https://doi.org/10.1007/978-0-387-21579-2_10

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95471-4

  • Online ISBN: 978-0-387-21579-2
