Bagging with Asymmetric Costs for Misclassified and Correctly Classified Examples

  • Ricardo Ñanculef
  • Carlos Valle
  • Héctor Allende
  • Claudio Moraga
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4756)


Diversity is a key requirement for obtaining gains from combining predictors. In this paper, we propose a modification of bagging that explicitly trades off diversity against individual accuracy. The procedure consists of dividing the bootstrap replicate obtained at each iteration of the algorithm into two subsets: one containing the examples misclassified by the ensemble obtained at the previous iteration, and the other containing the examples correctly recognized. High individual accuracy of a new classifier on the first subset increases diversity, measured by the Q statistic between the new classifier and the existing classifier ensemble. High accuracy on the second subset, on the other hand, decreases diversity. We trade off between both components of the individual accuracy using a parameter λ ∈ [0,1] that changes the cost of a misclassification on the second subset. Experiments are provided on well-known classification problems from the UCI repository. Results are also compared with boosting and bagging.
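The procedure described above can be sketched in code. This is a hypothetical illustration, not the paper's implementation: the paper uses neural network base learners, while the sketch below substitutes a weighted decision stump; the exact way the per-example costs enter base-learner training, and all names (`asymmetric_bagging`, `Stump`, `q_statistic`), are assumptions. The key ideas it shows are (1) costing misclassified examples 1 and correctly classified examples λ, and (2) the Q statistic used to measure pairwise diversity.

```python
import numpy as np

def q_statistic(pred_a, pred_b, y):
    """Yule's Q between the correctness patterns of two classifiers:
    Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), in [-1, 1].
    N11 = both correct, N00 = both wrong, N10/N01 = exactly one correct."""
    a, b = pred_a == y, pred_b == y
    n11, n00 = np.sum(a & b), np.sum(~a & ~b)
    n10, n01 = np.sum(a & ~b), np.sum(~a & b)
    denom = n11 * n00 + n01 * n10
    return 0.0 if denom == 0 else (n11 * n00 - n01 * n10) / denom

class Stump:
    """Weighted decision stump, a stand-in for the paper's neural networks."""
    def fit(self, X, y, w):
        best = (np.inf, 0, 0.0, 1)
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = (sign * (X[:, j] - thr) > 0).astype(int)
                    err = np.sum(w * (pred != y))  # cost-weighted error
                    if err < best[0]:
                        best = (err, j, thr, sign)
        _, self.j, self.thr, self.sign = best
        return self

    def predict(self, X):
        return (self.sign * (X[:, self.j] - self.thr) > 0).astype(int)

def majority_vote(ensemble, X):
    return (np.mean([c.predict(X) for c in ensemble], axis=0) >= 0.5).astype(int)

def asymmetric_bagging(X, y, T=10, lam=0.5, seed=0):
    """Bagging variant: on each bootstrap replicate, examples the current
    ensemble misclassifies get cost 1, while correctly classified examples
    get cost lam in [0, 1]; the costs are used as training weights."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(T):
        idx = rng.integers(0, len(y), len(y))   # bootstrap replicate
        Xb, yb = X[idx], y[idx]
        if ensemble:
            # Split the replicate by ensemble correctness via asymmetric costs.
            cost = np.where(majority_vote(ensemble, Xb) != yb, 1.0, lam)
        else:
            cost = np.ones(len(yb))             # first iteration: plain bagging
        ensemble.append(Stump().fit(Xb, yb, cost / cost.sum()))
    return ensemble
```

With λ = 1 every example costs the same and the procedure reduces to cost-weighted bagging; lowering λ pushes each new classifier toward the examples the ensemble currently gets wrong, increasing diversity (lower Q) at the price of individual accuracy on the already-correct subset.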


Keywords: Ensemble Methods · Bagging · Diversity · Neural Networks · Classification Algorithms



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Ricardo Ñanculef ¹
  • Carlos Valle ¹
  • Héctor Allende ¹
  • Claudio Moraga ²˒³

  1. Universidad Técnica Federico Santa María, Departamento de Informática, CP 110-V, Valparaíso, Chile
  2. European Centre for Soft Computing, 33600 Mieres, Asturias, Spain
  3. Dortmund University, 44221 Dortmund, Germany
