Annals of Operations Research, Volume 263, Issue 1–2, pp 5–19

Distant diversity in dynamic class prediction

Data Mining and Analytics


Instead of using the same ensemble for all data instances, recent studies have focused on dynamic ensembles, in which a new ensemble is chosen from a pool of classifiers for each new data instance. Classifier agreement in the region where a new data instance resides has been considered a major factor in dynamic ensembles. We postulate that the classifiers chosen for a dynamic ensemble should behave similarly in the region in which the new instance resides, but differently outside of this area. In other words, we hypothesize that high local accuracy, combined with high diversity in other regions, is desirable. To verify this hypothesis we propose two approaches. The first finds the k-nearest data instances to the new instance, which define a neighborhood, and simultaneously maximizes local accuracy within the neighborhood and diversity on the data instances outside of it. The second uses an alternative definition of the neighborhood: all data instances belong to it, but each instance's influence on the accuracy and diversity measures depends on its distance to the new instance. We demonstrate through several experiments that distance-based diversity and accuracy outperform all benchmark methods.
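To make the two selection schemes concrete, the following minimal NumPy sketch greedily builds an ensemble for a single query instance. It is not the paper's algorithm: the exponential distance kernel, the greedy search, the equal accuracy/diversity trade-off, and all function and parameter names are illustrative assumptions.

```python
import numpy as np

def distance_based_selection(x_new, X_val, y_val, preds_val,
                             k=10, n_select=5, weighted=False, gamma=1.0):
    """Greedily pick a per-instance ensemble from a classifier pool.

    preds_val : (n_classifiers, n_val) predictions on a validation set.
    weighted=False -- sketch of the first approach: accuracy is measured
    on the k nearest validation instances, diversity on the instances
    outside that neighborhood.  weighted=True -- sketch of the second
    approach: every instance counts, with near instances driving
    accuracy and far instances driving diversity.
    """
    d = np.linalg.norm(X_val - x_new, axis=1)
    if weighted:
        w_acc = np.exp(-gamma * d)          # near points weigh accuracy
        w_div = 1.0 - w_acc / w_acc.max()   # far points weigh diversity
    else:
        w_acc = np.zeros_like(d)
        w_acc[np.argsort(d)[:k]] = 1.0      # the k-NN neighborhood
        w_div = 1.0 - w_acc                 # everything outside it

    correct = (preds_val == y_val).astype(float)   # (n_clf, n_val)
    acc = correct @ w_acc / w_acc.sum()            # (weighted) local accuracy

    chosen = [int(np.argmax(acc))]   # seed with the most locally accurate
    while len(chosen) < n_select:
        scores = np.full(len(acc), -np.inf)
        for c in range(len(acc)):
            if c in chosen:
                continue
            # mean weighted disagreement with classifiers already chosen
            dis = np.mean([(preds_val[c] != preds_val[s]).astype(float)
                           @ w_div / w_div.sum() for s in chosen])
            scores[c] = acc[c] + dis   # equal trade-off: an assumption
        chosen.append(int(np.argmax(scores)))
    return chosen
```

With weighted=False the sketch corresponds to the hard-neighborhood approach; with weighted=True all validation instances contribute with distance-dependent influence, mirroring the second approach.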


Keywords: Dynamic ensemble · Classification · Diversity · Local accuracy



We presented this study at the INFORMS 2015 Data Mining & Analytics Workshop and were invited by the workshop co-chairs to submit to this special issue.


  1. Ahn, H., & Kim, K. J. (2008). Using genetic algorithms to optimize nearest neighbors for data mining. Annals of Operations Research, 163(1), 5–18.CrossRefGoogle Scholar
  2. Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.Google Scholar
  3. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.CrossRefGoogle Scholar
  4. Brown, G., & Kuncheva, L. I. (2010). “Good” and “bad” diversity in majority vote ensembles. In N. Gayar, J. Kittler, & F. Roli (eds.), Multiple classifier systems, lecture notes in computer science (Vol. 5997, pp. 124–133). Berlin, Heidelberg: Springer. doi: 10.1007/978-3-642-12127-2-13.
  5. Cavalin, P. R., Sabourin, R., & Suen, C. Y. (2010). Dynamic selection of ensembles of classifiers using contextual information. In N. Gayar, J. Kittler, & F. Roli (Eds.), Multiple classifier systems, lecture notes in computer science (Vol. 5997, pp. 145–154). Berlin, Heidelberg: Springer. doi: 10.1007/978-3-642-12127-2-15.
  6. Chang, C. C., & Lin, C. J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3), 27.Google Scholar
  7. Didaci, L., & Giacinto, G. (2004). Dynamic classifier selection by adaptive K-nearest-neighbourhood rule. In F. Roli, J. Kittler & T. Windeatt (Eds.), Multiple classifier systems: Proceedings of the 5th International workshop, MCS 2004, Cagliari, Italy, June 9–11 (pp. 174–183). Berlin: Springer.Google Scholar
  8. Didaci, L., Giacinto, G., Roli, F., & Marcialis, G. L. (2005). A study on the performances of dynamic classifier selection based on local accuracy estimation. Pattern Recognition, 38(11), 2188–2191.CrossRefGoogle Scholar
  9. Dos Santos, E. M., Sabourin, R., & Maupin, P. (2008). A dynamic overproduce-and-choose strategy for the selection of classifier ensembles. Pattern Recognition, 41(10), 2993–3009.CrossRefGoogle Scholar
  10. Freund, Y., & Schapire, R. E. (1996). Experiments with a new boosting algorithm. In International workshop on machine learning (Vol. 96, pp. 148–156). Morgan Kaufmann.Google Scholar
  11. Giacinto, G., & Roli, F. (2001). Dynamic classifier selection based on multiple classifier behaviour. Pattern Recognition, 34(9), 1879–1882.CrossRefGoogle Scholar
  12. Gray, G. A., Williams, P. J., Brown, W. M., Faulon, J. L., & Sale, K. L. (2010). Disparate data fusion for protein phosphorylation prediction. Annals of Operations Research, 174(1), 219–235.CrossRefGoogle Scholar
  13. Hsu, K. W., & Srivastava, J. (2010). Relationship between diversity and correlation in multi-classifier systems. In M. J. Zaki, J. X. Yu, B. Ravindran, & V. Pudi (Eds.), Advances in knowledge discovery and data mining, lecture notes in computer science (Vol. 6119, pp. 500–506). Berlin, Heidelberg: Springer. doi: 10.1007/978-3-642-13672-6-47.
  14. Ko, A. H., Sabourin, R., & Britto, A. S, Jr. (2008). From dynamic classifier selection to dynamic ensemble selection. Pattern Recognition, 41(5), 1718–1731.CrossRefGoogle Scholar
  15. Kohavi, R., & Wolpert, D. H. (1996). Bias plus variance decomposition for zero-one loss functions. In Machine learning: Proceedings of the thirteenth international (pp. 275–283).Google Scholar
  16. Kuncheva, L. I., & Whitaker, C. J. (2003). Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51(2), 181–207.CrossRefGoogle Scholar
  17. Margineantu, D. D., & Dietterich, T. G. (1997). Pruning adaptive boosting. International Workshop on Machine Learning, 97, 211–218.Google Scholar
  18. Tang, E. K., Suganthan, P. N., & Yao, X. (2006). An analysis of diversity measures. Machine Learning, 65(1), 247–271.CrossRefGoogle Scholar
  19. Tumer, K., & Ghosh, J. (1996). Error correlation and error reduction in ensemble classifiers. Connection Science, 8, 385–404.CrossRefGoogle Scholar
  20. Woods, K., Kegelmeyer, W. P, Jr., & Bowyer, K. (1997). Combination of multiple classifiers using local accuracy estimates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4), 405–410.CrossRefGoogle Scholar
  21. Yaşar Sağlam, C., & Street, W. N. (2014). Dynamic class prediction with classifier based distance measure. In Conferences in research and practice in information technology (CRPIT): Proceedings of The twelfth Australasian data mining conference, ICML-04 (Vol. 158).Google Scholar

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Evidence, Monitoring and Governance; Corporate, Governance and Information, Ministry of Business, Innovation, and Employment, Wellington, New Zealand
  2. Department of Management Sciences, University of Iowa, Iowa City, USA
