Diversity in Combinations of Heterogeneous Classifiers

  • Kuo-Wei Hsu
  • Jaideep Srivastava
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5476)

Abstract

In this paper, we introduce the use of combinations of heterogeneous classifiers to achieve better diversity. Conducting theoretical and empirical analyses of the diversity of combinations of heterogeneous classifiers, we study the relationship between heterogeneity and diversity. On the one hand, the theoretical analysis serves as a foundation for employing heterogeneous classifiers in Multi-Classifier Systems or ensembles. On the other hand, experimental results provide empirical evidence. We consider synthetic as well as real data sets, utilize classification algorithms that are essentially different, and employ various popular diversity measures for evaluation. Two interesting observations will contribute to the future design of Multi-Classifier Systems and ensemble techniques. First, the diversity among heterogeneous classifiers is higher than that among homogeneous ones, and hence using heterogeneous classifiers to construct classifier combinations would increase the diversity. Second, the heterogeneity primarily results from different classification algorithms rather than the same algorithm with different parameters.
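The abstract does not say which diversity measures were used. As an illustration only, the sketch below computes two pairwise measures that are common in the ensemble literature: the disagreement measure and Yule's Q statistic. Both are derived from whether each classifier labels each instance correctly. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def correctness(y_true: Sequence[int], y_pred: Sequence[int]) -> list:
    """Return one boolean per instance: True where the prediction is correct."""
    return [t == p for t, p in zip(y_true, y_pred)]


def oracle_counts(correct_a: Sequence[bool],
                  correct_b: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Count instances by joint correctness of two classifiers.

    Returns (n11, n10, n01, n00): both correct, only A correct,
    only B correct, both wrong.
    """
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif a:
            n10 += 1
        elif b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def disagreement(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Fraction of instances on which exactly one classifier is correct.

    Higher values indicate a more diverse pair.
    """
    n11, n10, n01, n00 = oracle_counts(correct_a, correct_b)
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("no instances to compare")
    return (n10 + n01) / total


def q_statistic(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Yule's Q in [-1, 1]; lower values indicate a more diverse pair."""
    n11, n10, n01, n00 = oracle_counts(correct_a, correct_b)
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        raise ValueError("Q statistic is undefined for these counts")
    return (n11 * n00 - n01 * n10) / denominator


# Toy data (hypothetical): true labels and the outputs of two classifiers.
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
pred_a = [0, 1, 1, 1, 1, 1, 1, 1]
pred_b = [0, 0, 1, 0, 0, 1, 1, 1]

correct_a = correctness(y_true, pred_a)
correct_b = correctness(y_true, pred_b)

print("disagreement:", disagreement(correct_a, correct_b))
print("Q statistic:", q_statistic(correct_a, correct_b))
```

To compare a heterogeneous pair with a homogeneous one under this sketch, one would compute the same measures for both pairs on a shared test set and compare the values. The paper's own protocol may differ.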

Keywords

Multi-Classifier System, ensemble, diversity, heterogeneity

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Kuo-Wei Hsu (1)
  • Jaideep Srivastava (1)
  1. University of Minnesota, Minneapolis, USA