Randomization in Aggregated Classification Trees

  • Eugeniusz Gatnar
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Tree-based models are popular and widely used because they are simple, flexible, and powerful tools for classification. Unfortunately, they are not stable classifiers. A significant improvement in model stability and prediction accuracy can be obtained by aggregating multiple classification trees. The reduction of classification error results from decreasing the bias and/or variance of the committee of trees (also called an ensemble or a forest). In this paper we discuss and compare different methods of model aggregation. We also address the problem of finding the minimal number of trees sufficient for the forest.


Keywords: Prediction Error, Training Sample, Random Forest, Component Tree, Classification Error
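The following minimal Python sketch (not from the paper) illustrates the aggregation idea summarised in the abstract: a committee of classification trees is grown on bootstrap samples of the training set and the forest classifies by majority vote. The use of scikit-learn's DecisionTreeClassifier, the Iris data, and a forest of 50 trees are illustrative assumptions, not the paper's own experiments.

```python
# A minimal sketch of tree aggregation by bagging and majority voting.
# Illustrative only: the dataset, library calls, and forest size are
# assumptions for demonstration, not the paper's own method.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_trees = 50                                   # size of the committee (forest)
forest = []
for _ in range(n_trees):
    # Each component tree is grown on a bootstrap sample of the training set.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    forest.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Predictions of all component trees, one row per tree.
votes = np.stack([tree.predict(X_test) for tree in forest])

def majority_vote(vote_rows):
    """Classify each test case by the most frequent label among the trees."""
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, vote_rows)

# Accuracy of the full committee.
print("forest accuracy:", (majority_vote(votes) == y_test).mean())

# Accuracy as trees are added: a simple way to gauge how many trees suffice,
# which relates to the minimal-forest-size question raised in the abstract.
for m in (1, 5, 10, 25, n_trees):
    acc = (majority_vote(votes[:m]) == y_test).mean()
    print(f"{m:3d} trees: accuracy {acc:.3f}")
```

Random forests extend this bagging scheme by additionally randomizing the choice of split variables within each component tree.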





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Eugeniusz Gatnar
  1. Institute of Statistics, Katowice University of Economics, Katowice, Poland
