Machine Learning, Volume 65, Issue 1, pp 247–271

An analysis of diversity measures

  • E. K. Tang
  • P. N. Suganthan
  • X. Yao


Diversity among the base classifiers is deemed to be important when constructing a classifier ensemble. Numerous algorithms have been proposed to construct a good classifier ensemble by seeking both the accuracy of the base classifiers and the diversity among them. However, there is no generally accepted definition of diversity, and measuring diversity explicitly is very difficult. Although researchers have designed several experimental studies to compare different diversity measures, the results observed have usually been confusing. In this paper, we present a theoretical analysis of six existing diversity measures (namely the disagreement measure, double fault measure, KW variance, inter-rater agreement, generalized diversity, and measure of difficulty), show the underlying relationships between them, and relate them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms. We illustrate why confusing experimental results have been observed and show that the discussed diversity measures are inherently ineffective. Our analysis provides a deeper understanding of the concept of diversity, and hence can help in designing better ensemble learning algorithms.
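To make two of the quantities discussed above concrete, the sketch below computes the pairwise disagreement measure and the per-sample majority-vote margin from an ensemble's "oracle" output (a 0/1 matrix recording which base classifiers are correct on which samples). The definitions used are the standard ones from the diversity literature, not code from the paper itself; the function names and the toy matrix are illustrative assumptions.

```python
import numpy as np

def disagreement(oracle):
    """Average pairwise disagreement measure.

    oracle: (L, N) 0/1 array; oracle[i, n] = 1 iff base classifier i
    classifies sample n correctly. For each pair of classifiers, the
    disagreement is the fraction of samples on which exactly one of
    the two is correct; the ensemble-level value averages over pairs.
    """
    L, _ = oracle.shape
    total, pairs = 0.0, 0
    for i in range(L):
        for j in range(i + 1, L):
            total += np.mean(oracle[i] != oracle[j])
            pairs += 1
    return total / pairs

def margins(oracle):
    """Per-sample majority-vote margin in [-1, 1]: the fraction of
    base classifiers voting correctly minus the fraction voting
    incorrectly. Positive margin means the majority vote is correct."""
    return 2.0 * oracle.mean(axis=0) - 1.0

# Toy ensemble: 3 classifiers, 3 samples, each classifier wrong once.
oracle = np.array([[1, 1, 0],
                   [1, 0, 1],
                   [0, 1, 1]])
print(disagreement(oracle))  # each pair disagrees on 2 of 3 samples
print(margins(oracle))       # every sample has margin 1/3
```

This "oracle output" representation is the same abstraction used by pairwise measures such as the disagreement and double fault measures, which is what allows them to be related analytically to the margin distribution.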


Keywords: Classifier ensemble · Diversity measures · Margin distribution · Majority vote · Disagreement measure · Double fault measure · KW variance · Inter-rater agreement · Generalized diversity · Measure of difficulty · Entropy measure · Coincident failure diversity



Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  1. School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
  2. School of Computer Science, University of Birmingham, Birmingham, UK
