Abstract
Diversity among the base classifiers is deemed important when constructing a classifier ensemble. Numerous algorithms have been proposed to construct a good ensemble by seeking both accuracy of the base classifiers and diversity among them. However, there is no generally accepted definition of diversity, and measuring it explicitly is difficult. Although researchers have designed several experimental studies to compare different diversity measures, the results have usually been confusing. In this paper, we present a theoretical analysis of six existing diversity measures (namely the disagreement measure, double-fault measure, KW variance, inter-rater agreement, generalized diversity, and measure of difficulty), show the underlying relationships among them, and relate them to the concept of margin, which is more explicitly related to the success of ensemble learning algorithms. We explain why confusing experimental results have been observed and show that the diversity measures discussed are inherently ineffective. Our analysis provides a deeper understanding of the concept of diversity and can therefore help in designing better ensemble learning algorithms.
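As an illustration (not taken from the paper itself), two of the pairwise diversity measures named above and the voting margin can be sketched in Python. The definitions follow their standard form in the diversity-measure literature; the function names `pairwise_diversity` and `voting_margin` are ours:

```python
import numpy as np

def pairwise_diversity(pred_a, pred_b, y):
    """Disagreement and double-fault measures for two base classifiers.

    pred_a, pred_b: predicted labels from the two classifiers.
    y: true labels.
    """
    correct_a = pred_a == y
    correct_b = pred_b == y
    n = len(y)
    n00 = np.sum(~correct_a & ~correct_b)  # both classifiers wrong
    n01 = np.sum(~correct_a & correct_b)   # only b correct
    n10 = np.sum(correct_a & ~correct_b)   # only a correct
    disagreement = (n01 + n10) / n  # fraction where exactly one is correct
    double_fault = n00 / n          # fraction where both are wrong
    return disagreement, double_fault

def voting_margin(preds, y):
    """Margin of a majority-vote ensemble on each example:
    (votes for true class - max votes for any other class) / n_classifiers.
    preds has shape (n_classifiers, n_examples)."""
    preds = np.asarray(preds)
    n_clf, n_ex = preds.shape
    margins = np.empty(n_ex)
    for i in range(n_ex):
        votes = np.bincount(preds[:, i])
        true_votes = votes[y[i]] if y[i] < len(votes) else 0
        other = votes.copy()
        if y[i] < len(other):
            other[y[i]] = 0
        margins[i] = (true_votes - other.max()) / n_clf
    return margins

# Toy example: two base classifiers on five labelled points.
y = np.array([0, 1, 1, 0, 1])
a = np.array([0, 1, 0, 0, 1])  # wrong on example 2
b = np.array([1, 1, 0, 0, 1])  # wrong on examples 0 and 2
d, df = pairwise_diversity(a, b, y)  # d = 0.2, df = 0.2
```

A margin close to 1 means the ensemble votes for the correct class nearly unanimously; a negative margin means the example is misclassified by the majority vote.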
Editor: Tom Fawcett
Tang, E.K., Suganthan, P.N. & Yao, X. An analysis of diversity measures. Mach Learn 65, 247–271 (2006). https://doi.org/10.1007/s10994-006-9449-2