The minimum description length based decision tree pruning

  • Igor Kononenko
Induction (Decision Tree Pruning, Feature Selection, Feature Discretization)
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1531)


We describe Minimum Description Length (MDL) based decision tree pruning. A subtree is considered unreliable, and is therefore pruned, if the description length of the classification of the corresponding subsets of training instances, together with the description lengths of the paths in the subtree, is greater than the description length of the classification of the whole subset of training instances in the current node. We compare the performance of our simple, parameterless, and well-founded MDL method with several other methods on 18 datasets. The classification accuracy obtained with MDL pruning is comparable to that of the other approaches, and the resulting decision trees are nearly optimally pruned, which makes our method an attractive tool for obtaining a first approximation of the target decision tree during the knowledge discovery process.
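The criterion in the abstract compares two code lengths: the cost of describing the node's instances as a single leaf versus the cost of describing each leaf's class labels plus each path in the subtree. The following minimal Python sketch makes that comparison concrete; it assumes the standard enumerative (multinomial) coding of class labels, and the function names and per-path bit costs are illustrative assumptions, not the paper's exact scheme.

```python
import math

LN2 = math.log(2)

def log2_factorial(n):
    """log2(n!) via the log-gamma function."""
    return math.lgamma(n + 1) / LN2

def log2_binomial(n, k):
    """log2 of the binomial coefficient C(n, k)."""
    return log2_factorial(n) - log2_factorial(k) - log2_factorial(n - k)

def class_coding_length(counts):
    """Bits needed to transmit the class labels of a set of instances
    whose class frequency vector is `counts` (enumerative code):
    log2 of the multinomial coefficient, plus the cost of the
    frequency vector itself, one of C(n + k - 1, k - 1) possibilities.
    """
    n, k = sum(counts), len(counts)
    label_bits = log2_factorial(n) - sum(log2_factorial(c) for c in counts)
    freq_bits = log2_binomial(n + k - 1, k - 1)
    return label_bits + freq_bits

def should_prune(leaf_counts, path_bits, node_counts):
    """Prune the subtree iff describing each leaf's classification plus
    each root-to-leaf path costs more bits than classifying the node
    as a single leaf (the comparison stated in the abstract).

    leaf_counts -- class frequency vector of each leaf of the subtree
    path_bits   -- bits to describe each root-to-leaf path (assumed given)
    node_counts -- class frequency vector of all instances at the node
    """
    subtree_bits = sum(class_coding_length(c) for c in leaf_counts) + sum(path_bits)
    node_bits = class_coding_length(node_counts)
    return subtree_bits > node_bits

# Hypothetical example: a node with class counts (20, 10) split into leaves
# (12, 8) and (8, 2), each path costing about 3 bits to describe. The split
# barely changes the class distribution, so the paths are not worth their
# bits and the subtree is pruned.
print(should_prune([(12, 8), (8, 2)], [3.0, 3.0], (20, 10)))  # True
```

Applying this comparison bottom-up over the tree yields a parameterless pruning procedure; in practice the cost of describing a path would depend on the number of attributes and attribute values at each split.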

Key words

machine learning · decision trees · MDL principle



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Igor Kononenko
  1. Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
