A Bayes Evaluation Criterion for Decision Trees

  • Nicolas Voisine
  • Marc Boullé
  • Carine Hue
Part of the Studies in Computational Intelligence book series (SCI, volume 292)

Abstract

We present a new evaluation criterion for the induction of decision trees. We exploit a parameter-free Bayesian approach and propose an analytic formula for the evaluation of the posterior probability of a decision tree given the data. We thus transform the training problem into an optimization problem in the space of decision tree models, and search for the best tree, which is the maximum a posteriori (MAP) one. The optimization is performed using top-down heuristics with pre-pruning and post-pruning processes. Extensive experiments on 30 UCI datasets and on the 5 WCCI 2006 performance prediction challenge datasets show that our method obtains predictive performance similar to that of alternative state-of-the-art methods, with far simpler trees.
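The abstract describes a two-part Bayes/MDL-style criterion: a tree is scored by the description length of the model (prior) plus that of the class labels given the tree (likelihood), and a split is kept only when it shortens the total code. The sketch below is a minimal illustration of that idea, not the paper's exact criterion; the leaf cost (a generic multinomial prior-plus-likelihood code length) and the `split_penalty` term standing in for the cost of encoding the split are assumptions for illustration.

```python
from math import lgamma, log

def log_comb(n, k):
    # Natural log of the binomial coefficient C(n, k).
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def leaf_cost(counts):
    """Two-part code length (in nats) of the class labels in a leaf:
    cost of choosing a class distribution over k classes for n instances,
    plus cost of the label sequence given that distribution."""
    n = sum(counts)
    k = len(counts)
    prior = log_comb(n + k - 1, k - 1)                         # distribution choice
    likelihood = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    return prior + likelihood

def split_gain(parent_counts, children_counts, split_penalty):
    """Pre-pruning rule: accept a split only if encoding the children
    plus the split itself is cheaper than encoding the parent leaf."""
    before = leaf_cost(parent_counts)
    after = split_penalty + sum(leaf_cost(c) for c in children_counts)
    return before - after  # > 0 means the split shortens the code

# A perfectly separating split pays for itself; an uninformative one does not.
print(split_gain([50, 50], [[50, 0], [0, 50]], log(10)) > 0)   # True
print(split_gain([10, 10], [[5, 5], [5, 5]], log(10)) > 0)     # False
```

Because the split penalty grows with the number of candidate splits, this kind of criterion needs no separate significance threshold: pruning falls out of the code-length comparison itself, which is what lets the method produce simple trees.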

Keywords

Decision Tree · Bayesian Optimization · Minimum Description Length · Supervised Learning · Model Selection

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Nicolas Voisine (1)
  • Marc Boullé (1)
  • Carine Hue (1)

  1. Orange Labs, France