A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms

  • Published: September 2000
  • Volume 40, pages 203–228 (2000)
  • Tjen-Sien Lim,
  • Wei-Yin Loh &
  • Yu-Shan Shih

Abstract

Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called POLYCLASS at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is QUEST with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. POLYCLASS, for example, is third from last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The QUEST and logistic regression algorithms are substantially faster. Among decision tree algorithms with univariate splits, C4.5, IND-CART, and QUEST have the best combinations of error rate and speed. But C4.5 tends to produce trees with twice as many leaves as those from IND-CART and QUEST.
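
The two accuracy criteria above are straightforward to compute. The following Python sketch (illustrative only, not the authors' code; the error-rate matrix is a random placeholder) shows how the mean error rate and the mean rank of error rate are obtained from a table with one row per dataset and one column per algorithm. The paper's statements about statistical significance rest on an analysis of the same per-dataset ranks (Friedman, 1937).

import numpy as np
from scipy.stats import rankdata

# Placeholder data: error_rates[i, j] = error rate of algorithm j on dataset i.
# The study compared thirty-three algorithms on thirty-two datasets.
rng = np.random.default_rng(0)
error_rates = rng.uniform(0.05, 0.40, size=(32, 33))

# Criterion 1: mean error rate of each algorithm, averaged over datasets.
mean_error = error_rates.mean(axis=0)

# Criterion 2: mean rank of error rate. Within each dataset, rank the
# algorithms from lowest error (rank 1) upward; ties receive average ranks.
ranks = np.apply_along_axis(rankdata, 1, error_rates)
mean_rank = ranks.mean(axis=0)

print("Best algorithm by mean error rate:", int(np.argmin(mean_error)))
print("Best algorithm by mean rank:      ", int(np.argmin(mean_rank)))

With real error rates, the algorithm minimizing either criterion heads the corresponding ranking; two algorithms whose mean ranks differ by less than the critical difference of such a rank test are, as in the abstract, not declared statistically significantly different.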

References

  • Agresti, A. (1990). Categorical Data Analysis. New York, NY: John Wiley & Sons.
  • Aronis, J. M. & Provost, F. J. (1997). Increasing the efficiency of data mining algorithms with breadth-first marker propagation. In D. Heckerman, H. Mannila, D. Pregibon, & R. Uthurusamy (Eds.), Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (pp. 119–122). Menlo Park, CA: AAAI Press.
  • Auer, P., Holte, R. C., & Maass, W. (1995). Theory and applications of agnostic PAC-learning with small decision trees. In A. Prieditis & S. Russell (Eds.), Proceedings of the Twelfth International Conference on Machine Learning (pp. 21–29). San Francisco, CA: Morgan Kaufmann.
  • Becker, R. A., Chambers, J. M., & Wilks, A. R. (1988). The New S Language. Wadsworth.
  • Bishop, C. M. (1995). Neural Networks for Pattern Recognition. New York, NY: Oxford University Press.
  • Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1994). Classification and Regression Trees. New York, NY: Chapman and Hall.
  • Breslow, L. A. & Aha, D. W. (1997). Simplifying decision trees: A survey. Knowledge Engineering Review, 12, 1–40.
  • Brodley, C. E. & Utgoff, P. E. (1992). Multivariate versus univariate decision trees. Technical Report 92–8, Department of Computer Science, University of Massachusetts, Amherst, MA.
  • Brodley, C. E. & Utgoff, P. E. (1995). Multivariate decision trees. Machine Learning, 19, 45–77.
  • Brown, D. E., Corruble, V., & Pittard, C. L. (1993). A comparison of decision tree classifiers with backpropagation neural networks for multimodal classification problems. Pattern Recognition, 26, 953–961.
  • Bull, S. (1994). Analysis of attitudes toward workplace smoking restrictions. In N. Lange, L. Ryan, L. Billard, D. Brillinger, L. Conquest, & J. Greenhouse (Eds.), Case Studies in Biometry (pp. 249–271). New York, NY: John Wiley & Sons.
  • Buntine, W. (1992). Learning classification trees. Statistics and Computing, 2, 63–73.
  • Buntine, W. & Caruana, R. (1992). Introduction to IND Version 2.1 and Recursive Partitioning. NASA Ames Research Center, Moffett Field, CA.
  • Clark, L. A. & Pregibon, D. (1993). Tree-based models. In J. M. Chambers & T. J. Hastie (Eds.), Statistical Models in S (pp. 377–419). New York, NY: Chapman & Hall.
  • Cohen, W. W. (1995). Fast effective rule induction. In A. Prieditis & S. Russell (Eds.), Proceedings of the Twelfth International Conference on Machine Learning (pp. 115–123). San Francisco, CA: Morgan Kaufmann.
  • Curram, S. P. & Mingers, J. (1994). Neural networks, decision tree induction and discriminant analysis: An empirical comparison. Journal of the Operational Research Society, 45, 440–450.
  • Friedman, J. (1991). Multivariate adaptive regression splines (with discussion). Annals of Statistics, 19, 1–141.
  • Friedman, M. (1937). The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association, 32, 675–701.
  • Hand, D. J. (1997). Construction and Assessment of Classification Rules. Chichester, England: John Wiley & Sons.
  • Harrison, D. & Rubinfeld, D. L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81–102.
  • Hastie, T., Buja, A., & Tibshirani, R. (1995). Penalized discriminant analysis. Annals of Statistics, 23, 73–102.
  • Hastie, T. & Tibshirani, R. (1996). Discriminant analysis by Gaussian mixtures. Journal of the Royal Statistical Society, Series B, 58, 155–176.
  • Hastie, T., Tibshirani, R., & Buja, A. (1994). Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association, 89, 1255–1270.
  • Hollander, M. & Wolfe, D. A. (1994). Nonparametric Statistical Methods (2nd ed.). New York, NY: John Wiley & Sons.
  • Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11, 63–90.
  • Johnson, R. A. & Wichern, D. W. (1992). Applied Multivariate Statistical Analysis (3rd ed.). Englewood Cliffs, NJ: Prentice Hall.
  • Kohonen, T. (1995). Self-Organizing Maps. Heidelberg: Springer-Verlag.
  • Kooperberg, C., Bose, S., & Stone, C. J. (1997). Polychotomous regression. Journal of the American Statistical Association, 92, 117–127.
  • Lerman, C., Molyneaux, J. W., Pangemanan, S., & Iswarati. (1991). The determinants of contraceptive method and service point choice. In Secondary Analysis of the 1987 National Indonesia Contraceptive Prevalence Survey, Volume 1: Fertility and Family Planning. Honolulu, HI: East-West Population Institute.
  • Loh, W.-Y. & Shih, Y.-S. (1997). Split selection methods for classification trees. Statistica Sinica, 7, 815–840.
  • Loh, W.-Y. & Vanichsetakul, N. (1988). Tree-structured classification via generalized discriminant analysis (with discussion). Journal of the American Statistical Association, 83, 715–728.
  • Mangasarian, O. L. & Wolberg, W. H. (1990). Cancer diagnosis via linear programming. SIAM News, 23, 1–18.
  • Merz, C. J. & Murphy, P. M. (1996). UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine, CA (http://www.ics.uci.edu/~mlearn/MLRepository.html).
  • Michie, D., Spiegelhalter, D. J., & Taylor, C. C. (Eds.). (1994). Machine Learning, Neural and Statistical Classification. London: Ellis Horwood.
  • Miller, R. G., Jr. (1981). Simultaneous Statistical Inference (2nd ed.). New York, NY: Springer-Verlag.
  • Müller, W. & Wysotzki, F. (1994). Automatic construction of decision trees for classification. Annals of Operations Research, 52, 231–247.
  • Müller, W. & Wysotzki, F. (1997). The decision-tree algorithm CAL5 based on a statistical approach to its splitting algorithm. In G. Nakhaeizadeh & C. C. Taylor (Eds.), Machine Learning and Statistics: The Interface (pp. 45–65). New York, NY: John Wiley & Sons.
  • Murthy, S. K., Kasif, S., & Salzberg, S. (1994). A system for induction of oblique decision trees. Journal of Artificial Intelligence Research, 2, 1–33.
  • Neter, J., Wasserman, W., & Kutner, M. H. (1990). Applied Linear Statistical Models (3rd ed.). Boston, MA: Irwin.
  • Oates, T. & Jensen, D. (1997). The effects of training set size on decision tree complexity. In D. H. Fisher, Jr. (Ed.), Proceedings of the Fourteenth International Conference on Machine Learning (pp. 254–262). San Francisco, CA: Morgan Kaufmann.
  • Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
  • Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77–90.
  • Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
  • Sarle, W. S. (1994). Neural networks and statistical models. In Proceedings of the Nineteenth Annual SAS Users Group International Conference (pp. 1538–1550). Cary, NC: SAS Institute, Inc. (ftp://ftp.sas.com/pub/neural/neural1.ps).
  • SAS Institute, Inc. (1990). SAS/STAT User's Guide, Version 6 (Vols. 1 & 2). Cary, NC: SAS Institute, Inc.
  • Shavlik, J. W., Mooney, R. J., & Towell, G. G. (1991). Symbolic and neural learning algorithms: An empirical comparison. Machine Learning, 6, 111–144.
  • Venables, W. N. & Ripley, B. D. (1997). Modern Applied Statistics with S-Plus (2nd ed.). New York, NY: Springer.
  • Wolberg, W. H., Tanner, M. A., & Loh, W.-Y. (1988). Diagnostic schemes for fine needle aspirates of breast masses. Analytical and Quantitative Cytology and Histology, 10, 225–228.
  • Wolberg, W. H., Tanner, M. A., & Loh, W.-Y. (1989). Fine needle aspiration for breast mass diagnosis. Archives of Surgery, 124, 814–818.
  • Wolberg, W. H., Tanner, M. A., Loh, W.-Y., & Vanichsetakul, N. (1987). Statistical approach to fine needle aspiration diagnosis of breast masses. Acta Cytologica, 31, 737–741.

Author information

Authors and Affiliations

  1. Department of Statistics, University of Wisconsin, Madison, WI, 53706, USA

    Tjen-Sien Lim

  2. Department of Statistics, University of Wisconsin, Madison, WI, 53706, USA

    Wei-Yin Loh

  3. Department of Mathematics, National Chung Cheng University, Chiayi, 621, Taiwan, R.O.C.

    Yu-Shan Shih

About this article

Cite this article

Lim, TS., Loh, WY. & Shih, YS. A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms. Machine Learning 40, 203–228 (2000). https://doi.org/10.1023/A:1007608224229

  • Issue Date: September 2000

  • DOI: https://doi.org/10.1023/A:1007608224229

Keywords

  • classification tree
  • decision tree
  • neural net
  • statistical classifier