
Very Simple Classification Rules Perform Well on Most Commonly Used Datasets

  • Published: April 1993
  • Volume 11, pages 63–90 (1993)
  • Robert C. Holte

Abstract

This article reports an empirical investigation of the accuracy of rules that classify examples on the basis of a single attribute. On most datasets studied, the best of these very simple rules is as accurate as the rules induced by the majority of machine learning systems. The article explores the implications of this finding for machine learning research and applications.
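The single-attribute rules the abstract describes can be illustrated with a small sketch. This is not the article's exact 1R system (which also handles discretization of continuous attributes and missing values); it is a minimal illustration of the core idea, assuming nominal attributes: for each attribute, map each of its values to the majority class observed with that value, then keep the one attribute whose rule scores best on the training data. All names below (`one_rule`, the toy weather data) are illustrative, not from the article.

```python
from collections import Counter, defaultdict

def one_rule(examples, labels):
    """Fit a one-attribute classifier in the spirit of 1R.

    For each attribute, build a rule mapping each attribute value to the
    majority class seen with that value, then return the single attribute
    whose rule is most accurate on the training data.
    """
    n_attrs = len(examples[0])
    best = None  # (training accuracy, attribute index, value -> class map)
    for a in range(n_attrs):
        # Count class frequencies for each value of attribute a.
        counts = defaultdict(Counter)
        for x, y in zip(examples, labels):
            counts[x[a]][y] += 1
        # Each value predicts its majority class.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        correct = sum(rule[x[a]] == y for x, y in zip(examples, labels))
        acc = correct / len(examples)
        if best is None or acc > best[0]:
            best = (acc, a, rule)
    return best

# Toy data: attribute 0 predicts the class perfectly; attribute 1 is noise.
X = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
acc, attr, rule = one_rule(X, y)
```

On this toy data the learner selects attribute 0 with a perfect training rule, which mirrors the article's point: when one attribute carries most of the signal, a rule this simple is hard to beat.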



Author information

Authors and Affiliations

  1. Computer Science Department, University of Ottawa, Ottawa, Canada, K1N 6N5

    Robert C. Holte



About this article

Cite this article

Holte, R.C. Very Simple Classification Rules Perform Well on Most Commonly Used Datasets. Machine Learning 11, 63–90 (1993). https://doi.org/10.1023/A:1022631118932



Keywords

  • empirical learning
  • accuracy–complexity tradeoff
  • pruning
  • ID3