Graph-Based Model-Selection Framework for Large Ensembles

  • Krisztian Buza
  • Alexandros Nanopoulos
  • Lars Schmidt-Thieme
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6076)


The intuition behind ensembles is that different prediction models compensate for each other's errors when they are combined in an appropriate way. In a large ensemble, many different prediction models are available; however, many of them may share similar error characteristics, which strongly weakens the compensation effect. The selection of an appropriate subset of models is therefore crucial. In this paper, we address this problem. As our major contribution, for the case where a large number of models is present, we propose a graph-based framework for model selection that pays special attention to the interaction effects between models. Within this framework, we introduce four ensemble techniques and compare them to the state of the art in experiments on publicly available real-world data.
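The paper's concrete framework is not reproduced on this page, but the abstract's core idea can be illustrated with a minimal sketch: treat models as nodes of a graph whose edge weights measure how often two models err on the same examples, then greedily pick a subset that is both accurate and has low shared-error overlap. All names, the synthetic data, and the greedy criterion below are hypothetical illustrations, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: true binary labels and the predictions of 6 hypothetical models.
n = 200
y = rng.integers(0, 2, n)
preds = []
shared_noise = rng.random(n) < 0.25       # one error pattern shared by models 0-2
for _ in range(3):
    p = y.copy()
    p[shared_noise] = 1 - p[shared_noise]  # correlated errors
    preds.append(p)
for _ in range(3):
    noise = rng.random(n) < 0.25           # independent error patterns, models 3-5
    p = y.copy()
    p[noise] = 1 - p[noise]
    preds.append(p)
preds = np.array(preds)

# Graph: nodes are models, edge weight = fraction of examples on which
# both models are wrong (shared-error overlap).
errors = preds != y
k = len(preds)
overlap = np.zeros((k, k))
for i in range(k):
    for j in range(k):
        overlap[i, j] = np.mean(errors[i] & errors[j])

# Greedy selection: start from the most accurate model, then repeatedly add
# the model with the smallest average overlap with the current subset.
acc = 1 - errors.mean(axis=1)
selected = [int(np.argmax(acc))]
while len(selected) < 3:
    rest = [m for m in range(k) if m not in selected]
    best = min(rest, key=lambda m: overlap[m, selected].mean())
    selected.append(best)

# Combine the selected subset by majority vote.
vote = (preds[selected].mean(axis=0) > 0.5).astype(int)
ensemble_acc = float(np.mean(vote == y))
print(sorted(selected), ensemble_acc)
```

Because the greedy step avoids models with correlated errors, the selected trio tends to come from the independently erring models, and the majority vote recovers a higher accuracy than any single ~75%-accurate member, which is exactly the compensation effect the abstract describes.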


Keywords: Ensemble model selection




Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Krisztian Buza¹
  • Alexandros Nanopoulos¹
  • Lars Schmidt-Thieme¹

  1. Information Systems and Machine Learning Lab (ISMLL), Samelsonplatz 1, University of Hildesheim, Hildesheim, Germany
