A novel method for constructing ensemble classifiers

Abstract

This paper presents a novel method for generating ensemble classifiers that integrates bootstrap aggregation with Principal Component Analysis (PCA). To create each ensemble member, PCA is applied to the out-of-bag sample and the coefficients of all principal components are stored; the principal components computed from the corresponding bootstrap sample are then appended to the original feature set as additional features. A base classifier is trained on the bootstrap sample using features randomly selected from this enlarged set, and the final ensemble prediction is obtained by majority voting over the trained base classifiers. Empirical experiments and statistical tests show that the proposed method performs better than, or as well as, several other ensemble methods on benchmark data sets publicly available from the UCI repository. Furthermore, the diversity-accuracy patterns of the ensemble classifiers are investigated with kappa-error diagrams.
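
To make the construction above concrete, here is a minimal sketch in Python with NumPy and scikit-learn. It follows the abstract only: PCA is fitted on each out-of-bag sample, the principal components of the corresponding bootstrap sample are appended as extra features, a base classifier is trained on a random subset of the enlarged feature set, and predictions are combined by majority vote. The function names, the CART base learner, and the square-root subset size are assumptions for illustration, not details taken from the paper.

```python
# A minimal sketch of the procedure described in the abstract, assuming
# scikit-learn-style estimators. fit_ensemble / predict_ensemble /
# kappa_error_point are hypothetical names, not the authors' code.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier


def fit_ensemble(X, y, n_estimators=50, n_features=None, seed=0):
    """One (PCA, feature subset, tree) triple per ensemble member."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    members = []
    for _ in range(n_estimators):
        boot = rng.integers(0, n, size=n)          # bootstrap sample indices
        oob = np.setdiff1d(np.arange(n), boot)     # out-of-bag sample (~36.8% of X)
        pca = PCA().fit(X[oob])                    # stores all PC coefficients
        # PCs of the bootstrap sample become additional features
        X_aug = np.hstack([X[boot], pca.transform(X[boot])])
        # random subset of the enlarged feature set; sqrt size is an assumption
        k = n_features or max(1, int(np.sqrt(X_aug.shape[1])))
        subset = rng.choice(X_aug.shape[1], size=k, replace=False)
        tree = DecisionTreeClassifier(random_state=int(rng.integers(1 << 31)))
        tree.fit(X_aug[:, subset], y[boot])
        members.append((pca, subset, tree))
    return members


def predict_ensemble(members, X):
    """Majority vote; class labels are assumed to be coded 0..K-1."""
    votes = np.stack([
        tree.predict(np.hstack([X, pca.transform(X)])[:, subset])
        for pca, subset, tree in members
    ]).astype(int)
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)


def kappa_error_point(pred_i, pred_j, y):
    """One point of a kappa-error diagram: pairwise kappa vs. mean error."""
    theta1 = np.mean(pred_i == pred_j)             # observed agreement
    labels = np.unique(np.concatenate([pred_i, pred_j]))
    theta2 = sum(np.mean(pred_i == c) * np.mean(pred_j == c) for c in labels)
    kappa = (theta1 - theta2) / (1.0 - theta2)     # chance-corrected agreement
    error = (np.mean(pred_i != y) + np.mean(pred_j != y)) / 2.0
    return kappa, error
```

The last helper supports the kappa-error diagrams mentioned at the end of the abstract: evaluating `kappa_error_point` on every pair of base classifiers and plotting the resulting (kappa, error) points shows the diversity-accuracy trade-off, with points toward the lower left (low agreement, low error) being preferable.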

Author information

Corresponding author

Correspondence to Chun-Xia Zhang.

About this article

Cite this article

Zhang, CX., Zhang, JS. A novel method for constructing ensemble classifiers. Stat Comput 19, 317–327 (2009). https://doi.org/10.1007/s11222-008-9094-7
