Abstract

Diversity among individual classifiers is recognized to play a key role in ensemble learning; however, few theoretical properties of diversity are known for classification. In this paper, focusing on the popular ensemble pruning setting (i.e., combining classifiers by voting and measuring diversity in a pairwise manner), we present a theoretical study of the effect of diversity on the generalization performance of voting in the PAC-learning framework. We show that diversity is closely related to the hypothesis space complexity, and that encouraging diversity can be regarded as applying regularization to ensemble methods. Guided by this analysis, we apply explicit diversity regularization to ensemble pruning and propose the Diversity Regularized Ensemble Pruning (DREP) method. Experimental results show the effectiveness of DREP.
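The abstract describes DREP only at a high level; the full algorithm appears in the paper body. As a rough illustration of the setting it describes, the sketch below greedily selects a subensemble by trading off empirical voting error against pairwise diversity via a regularization parameter. Everything here is a hypothetical reconstruction, not the authors' actual procedure: the objective, the disagreement-based diversity measure, and names such as rho and greedy_diversity_regularized_pruning are assumptions.

```python
import numpy as np

def voting_error(preds, y):
    """Empirical error of the majority (sign) vote; ties count as errors."""
    vote = np.sign(preds.sum(axis=0))
    return np.mean(vote != y)

def pairwise_diversity(preds):
    """Mean pairwise disagreement between classifiers (higher = more diverse)."""
    k = len(preds)
    if k < 2:
        return 0.0
    total = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            total += np.mean(preds[i] != preds[j])
    return 2.0 * total / (k * (k - 1))

def greedy_diversity_regularized_pruning(preds, y, size, rho=0.5):
    """
    Greedily pick `size` classifiers out of an ensemble, minimizing
    (empirical voting error) - rho * (pairwise diversity).
    preds: (n_classifiers, n_samples) array of {-1, +1} predictions.
    y:     (n_samples,) array of {-1, +1} labels.
    """
    n = preds.shape[0]
    # Seed the subensemble with the single most accurate classifier.
    selected = [int(np.argmin([np.mean(p != y) for p in preds]))]
    while len(selected) < size:
        best, best_obj = None, np.inf
        for c in range(n):
            if c in selected:
                continue
            cand = preds[selected + [c]]
            obj = voting_error(cand, y) - rho * pairwise_diversity(cand)
            if obj < best_obj:
                best, best_obj = c, obj
        selected.append(best)
    return selected
```

Setting rho = 0 reduces this to accuracy-only greedy pruning; a larger rho favors more diverse subensembles, matching the regularization view suggested by the analysis.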

Keywords

diversity · ensemble pruning · diversity regularization

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nan Li (1, 2)
  • Yang Yu (1)
  • Zhi-Hua Zhou (1)

  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  2. School of Mathematical Sciences, Soochow University, Suzhou, China
