A novel method for constructing ensemble classifiers
This paper presents a novel method for generating ensemble classifiers that integrates the ideas of bootstrap aggregation (bagging) and Principal Component Analysis (PCA). To create each individual member of the ensemble, PCA is applied to the out-of-bag sample and the coefficients of all principal components are stored; these coefficients are then used to compute the principal components of the corresponding bootstrap sample, which are appended to the original feature set as additional features. A base classifier is trained on the bootstrap sample using features randomly selected from this enlarged feature set. The final ensemble classifier is formed by majority voting over the trained base classifiers. Empirical experiments and statistical tests on benchmark data sets publicly available from the UCI repository demonstrate that the proposed method performs better than, or as well as, several other ensemble methods. Furthermore, the diversity-accuracy patterns of the ensemble classifiers are investigated with kappa-error diagrams.
Keywords: Ensemble classifier · Bootstrap · Bagging · Random forest · AdaBoost · Principal component analysis · Kappa-error diagram
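As a rough illustration of the construction procedure described in the abstract, the following Python sketch builds such an ensemble, assuming a decision-tree base learner and scikit-learn's PCA implementation. The function names (build_ensemble, predict_ensemble), the default number of members, and the size of the random feature subset are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier


def build_ensemble(X, y, n_members=50, n_selected=None, random_state=0):
    """Train the base classifiers of the ensemble (illustrative sketch)."""
    rng = np.random.RandomState(random_state)
    n_samples, n_features = X.shape
    members = []
    for _ in range(n_members):
        # Draw a bootstrap sample; the unused rows form the out-of-bag (OOB) sample.
        boot_idx = rng.randint(0, n_samples, n_samples)
        oob_mask = np.ones(n_samples, dtype=bool)
        oob_mask[boot_idx] = False
        X_boot, y_boot = X[boot_idx], y[boot_idx]
        X_oob = X[oob_mask] if oob_mask.any() else X_boot

        # Fit PCA on the OOB sample and store its coefficient matrix.
        pca = PCA().fit(X_oob)

        # Use the stored coefficients to project the bootstrap sample and
        # append the resulting principal components to the original features.
        X_aug = np.hstack([X_boot, pca.transform(X_boot)])

        # Randomly select a subset of the enlarged feature set
        # (the subset size here is an assumption, not the paper's setting).
        k = n_selected if n_selected is not None else n_features
        feat_idx = rng.choice(X_aug.shape[1], size=k, replace=False)

        # Train one base classifier on the bootstrap sample and selected features.
        clf = DecisionTreeClassifier(random_state=rng.randint(1 << 30))
        clf.fit(X_aug[:, feat_idx], y_boot)
        members.append((pca, feat_idx, clf))
    return members


def predict_ensemble(members, X):
    """Combine the base classifiers by majority voting."""
    votes = []
    for pca, feat_idx, clf in members:
        X_aug = np.hstack([X, pca.transform(X)])
        votes.append(clf.predict(X_aug[:, feat_idx]))
    votes = np.array(votes)
    # Majority vote per test instance across all base classifiers.
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
```

With these definitions, calling build_ensemble(X_train, y_train) and then predict_ensemble(members, X_test) returns the majority-vote predictions of the ensemble.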