Trading Accuracy for Simplicity in Decision Trees

Bohanec, Marko; Bratko, Ivan

doi:10.1023/A:1022685808937

Published: June 1994

Machine Learning, Volume 15, pages 223–250 (1994)

Abstract

When communicating concepts, it is often convenient or even necessary to define a concept approximately. A simple, although only approximately accurate concept definition may be more useful than a completely accurate definition which involves a lot of detail. This paper addresses the problem: given a completely accurate, but complex, definition of a concept, simplify the definition, possibly at the expense of accuracy, so that the simplified definition still corresponds to the concept “sufficiently” well. Concepts are represented by decision trees, and the method of simplification is tree pruning. Given a decision tree that accurately specifies a concept, the problem is to find a smallest pruned tree that still represents the concept within some specified accuracy. A pruning algorithm is presented that finds an optimal solution by generating a dense sequence of pruned trees, decreasing in size, such that each tree has the highest accuracy among all the possible pruned trees of the same size. An efficient implementation of the algorithm, based on dynamic programming, is presented and empirically compared with three progressive pruning algorithms using both artificial and real-world data. An interesting empirical finding is that the real-world data generally allow significantly greater simplification at equal loss of accuracy.
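
The dynamic-programming idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes that tree size is measured as the number of leaves and that each node stores how many examples would be misclassified if the node were collapsed into a leaf. The names `Node`, `errors`, `best_errors_by_size`, and `smallest_size_within` are hypothetical and chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    # Assumed input: errors made on the data if this node becomes a leaf.
    errors: int
    children: list[Node] = field(default_factory=list)


def best_errors_by_size(node: Node) -> dict[int, int]:
    """Map leaf count -> minimum errors over all prunings of this subtree."""
    table = {1: node.errors}  # option 1: prune here, leaving a single leaf
    if not node.children:
        return table
    # Option 2: keep the split and distribute leaves among the children.
    combined = {0: 0}
    for child in node.children:
        child_table = best_errors_by_size(child)
        merged: dict[int, int] = {}
        for k1, e1 in combined.items():
            for k2, e2 in child_table.items():
                k, e = k1 + k2, e1 + e2
                if e < merged.get(k, float("inf")):
                    merged[k] = e
        combined = merged
    for k, e in combined.items():
        if e < table.get(k, float("inf")):
            table[k] = e
    return table


def smallest_size_within(root: Node, max_errors: int):
    """Return (leaf count, errors) of the smallest pruning within the error budget."""
    table = best_errors_by_size(root)
    for k in sorted(table):
        if table[k] <= max_errors:
            return k, table[k]
    return None


# A fully accurate 4-leaf tree over 100 examples (made-up error counts).
root = Node(40, [Node(10, [Node(0), Node(0)]), Node(5, [Node(0), Node(0)])])
print(best_errors_by_size(root))       # {1: 40, 2: 15, 3: 5, 4: 0}
print(smallest_size_within(root, 10))  # (3, 5)
```

In this toy tree, the table lists for each size the fewest errors any pruned tree of that size can make, so tolerating 10 errors in 100 examples allows pruning from four leaves to three. The sketch recomputes nothing but also reconstructs no trees; recovering the pruned tree itself would require storing the choice made at each node.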

Author information

Authors and Affiliations

Marko Bohanec: “Jožef Stefan” Institute, Jamova 39, SI-61111, Ljubljana, Slovenia
Ivan Bratko: Faculty of Electrical and Computer Engineering, University of Ljubljana, Tržaška 25, SI-61000, Ljubljana, Slovenia

About this article

Cite this article

Bohanec, M., Bratko, I. Trading Accuracy for Simplicity in Decision Trees. Machine Learning 15, 223–250 (1994). https://doi.org/10.1023/A:1022685808937

Issue Date: June 1994
DOI: https://doi.org/10.1023/A:1022685808937
