A theoretical framework for decision trees in uncertain domains: Application to medical data sets

  • B. Crémilleux
  • C. Robert
Decision-Support Theories
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1211)


Experimental evidence shows that many attribute selection criteria involved in the induction of decision trees perform comparably. We set up a theoretical framework that explains this empirical law. It furthermore provides an infinite set of criteria (the C.M. criteria) which contains the most commonly used criteria. We also define C.M. pruning, which is suitable in uncertain domains. In such domains, like medicine, some sub-trees that do not lessen the error rate can still be relevant, either to point out populations of specific interest or to give a representation of a large data file. C.M. pruning makes it possible to retain such sub-trees even when keeping them does not increase classification efficiency. We thus obtain a consistent framework for both building and pruning decision trees in uncertain domains. We give typical examples in medicine, highlighting the routine use of induction in this domain even when, for many cases, the targeted diagnosis cannot be reached from the findings under investigation.
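The C.M. criteria themselves are not defined in this excerpt, but the empirical law the abstract refers to can be illustrated with two standard attribute selection criteria, entropy-based impurity and Gini impurity. The sketch below is a generic illustration (the split counts are invented for the example, not taken from the paper): both criteria rank the same candidate split as better, which is the kind of agreement the framework explains.

```python
import math

def entropy(counts):
    """Shannon entropy of a class distribution given as counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def gini(counts):
    """Gini impurity of a class distribution given as counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def weighted_impurity(splits, impurity):
    """Average impurity of the child nodes produced by a split,
    weighted by the number of examples falling in each child."""
    total = sum(sum(s) for s in splits)
    return sum(sum(s) / total * impurity(s) for s in splits)

# Two hypothetical splits of 100 examples (class counts per child node).
split_a = [[40, 10], [10, 40]]   # fairly pure children
split_b = [[30, 20], [20, 30]]   # mixed children

for name, imp in [("entropy", entropy), ("gini", gini)]:
    a = weighted_impurity(split_a, imp)
    b = weighted_impurity(split_b, imp)
    print(f"{name}: split A = {a:.3f}, split B = {b:.3f}, prefers A: {a < b}")
```

On this example both criteria assign the lower weighted impurity to split A, agreeing on which attribute to select; the paper's contribution is a theoretical account of why such agreement is typical.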







Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • B. Crémilleux (1)
  • C. Robert (2)
  1. GREYC, CNRS - UPRESA 1526, Université de Caen, Caen Cédex, France
  2. Institut de Recherche en Mathématiques Appliquées, Université Joseph Fourier, Grenoble Cédex, France
