Machine Learning, Volume 51, Issue 2, pp 181–207

Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy

  • Ludmila I. Kuncheva
  • Christopher J. Whitaker


Abstract

Diversity among the members of a team of classifiers is deemed to be a key issue in classifier combination. However, measuring diversity is not straightforward because there is no generally accepted formal definition. We have found and studied ten statistics which can measure diversity among binary classifier outputs (correct or incorrect vote for the class label): four averaged pairwise measures (the Q statistic, the correlation, the disagreement and the double fault) and six non-pairwise measures (the entropy of the votes, the difficulty index, the Kohavi-Wolpert variance, the interrater agreement, the generalized diversity, and the coincident failure diversity). Four experiments have been designed to examine the relationship between the accuracy of the team and the measures of diversity, and among the measures themselves. Although there are proven connections between diversity and accuracy in some special cases, our results raise some doubts about the usefulness of diversity measures in building classifier ensembles in real-life pattern recognition problems.

Keywords: pattern recognition, multiple classifiers, ensemble/committee of learners, dependency and diversity, majority vote
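The pairwise measures listed in the abstract operate on "oracle" outputs, where each classifier's decision on each sample is recorded as 1 (correct) or 0 (incorrect), and are computed from the 2×2 table of joint correct/incorrect counts for a pair of classifiers; the Kohavi-Wolpert variance is a non-pairwise example built from per-sample counts of correct votes. A minimal sketch of three of the pairwise statistics and the Kohavi-Wolpert variance, following the standard definitions (function names and the NumPy encoding are ours, not from the paper):

```python
import numpy as np

def pairwise_table(a, b):
    """Joint correct/incorrect counts for two classifiers.

    a, b: binary arrays (1 = correct, 0 = incorrect) over the same N samples.
    Returns (n11, n10, n01, n00), e.g. n10 = #samples where a is correct, b wrong.
    """
    n11 = np.sum((a == 1) & (b == 1))
    n10 = np.sum((a == 1) & (b == 0))
    n01 = np.sum((a == 0) & (b == 1))
    n00 = np.sum((a == 0) & (b == 0))
    return n11, n10, n01, n00

def q_statistic(a, b):
    # Yule's Q: ranges over [-1, 1]; 0 for independent classifiers.
    # (Assumes the denominator is nonzero.)
    n11, n10, n01, n00 = pairwise_table(a, b)
    return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)

def disagreement(a, b):
    # Proportion of samples on which exactly one of the two is correct.
    n11, n10, n01, n00 = pairwise_table(a, b)
    return (n01 + n10) / len(a)

def double_fault(a, b):
    # Proportion of samples on which both classifiers are wrong.
    _, _, _, n00 = pairwise_table(a, b)
    return n00 / len(a)

def kw_variance(votes):
    """Kohavi-Wolpert variance (non-pairwise).

    votes: L x N binary matrix, one row per classifier (oracle outputs).
    """
    L, N = votes.shape
    correct = votes.sum(axis=0)  # number of classifiers correct on each sample
    return np.sum(correct * (L - correct)) / (N * L ** 2)
```

For the averaged pairwise measures, these statistics are computed for every pair of classifiers in the ensemble and averaged. Note the encoding here is the two-valued oracle output the abstract describes, not the class labels themselves.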



Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Ludmila I. Kuncheva (1)
  • Christopher J. Whitaker (1)

  1. School of Informatics, University of Wales, Bangor, Gwynedd, UK
