Ensemble Learning

  • H. B. Mitchell


The subject of this chapter is ensemble learning, in which our system is characterized by an ensemble of M models. The models may share a single common representational format, or each model may have its own distinct format. To make our discussion more concrete, we shall concentrate on the (supervised) classification of an object O using a multiple classifier system (MCS). Given an unknown object O, our goal is to optimally assign it to one of K classes, \(c_k, k \in \{1,2,\ldots,K\}\), using an ensemble of M (supervised) classifiers, \(S_m, m \in \{1,2,\ldots,M\}\). The theory of multiple classifier systems suggests that if the pattern of errors made by one classifier, \(S_m\), is different from the pattern of errors made by another classifier, \(S_n\), then we may exploit this difference to give a more accurate and more reliable classification of O. If the error rates of the classifiers are less than \(\frac{1}{2}\), then the MCS error rate, \(E_{\text{MCS}}\), should decrease with the number of classifiers, M, and with the mean diversity, \(\bar{\sigma}\), between the classifiers.
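The last claim can be illustrated with a minimal sketch under strong simplifying assumptions that are not taken from the chapter: a two-class problem, an odd number M of classifiers that all have the same error rate p, and statistically independent errors (an idealized form of diversity). Under these assumptions the majority-vote MCS errs exactly when more than half of the classifiers err, so \(E_{\text{MCS}}\) is a binomial tail probability. The function names `majority_vote` and `majority_vote_error` are hypothetical helpers chosen for this example.

```python
from collections import Counter
from math import comb


def majority_vote(labels):
    """Return the class label that receives the most votes.

    Ties are resolved in favour of the label encountered first.
    """
    return Counter(labels).most_common(1)[0][0]


def majority_vote_error(p, M):
    """Error rate of a majority vote over M classifiers.

    Assumes (for illustration only) two classes, independent errors,
    and an identical error rate p for every classifier. M must be odd
    so that the vote cannot tie.
    """
    if M < 1 or M % 2 == 0:
        raise ValueError("M must be a positive odd integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    k_min = M // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(comb(M, k) * p**k * (1 - p) ** (M - k) for k in range(k_min, M + 1))


if __name__ == "__main__":
    # With p < 1/2 the ensemble error shrinks as M grows.
    for M in (1, 3, 5, 11, 21):
        print(M, round(majority_vote_error(0.3, M), 4))
```

With p above \(\frac{1}{2}\) the same formula gives an error that grows with M, which is why the condition on the individual error rates matters. Real classifiers are rarely independent, so this sketch shows only the direction of the effect, not a prediction of actual MCS performance.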


Keywords (added by machine, not by the authors): Class Label; Majority Vote; Bayesian Model Average; Ensemble Learn; Expert Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Section 3424, IAI Elta Electronics Ind. Ltd., Ashdod, Israel
