Machine Learning, Volume 55, Issue 3, pp 251–270

Bagging Equalizes Influence

Yves Grandvalet
Abstract

Bagging constructs an estimator by averaging predictors trained on bootstrap samples. Bagged estimates almost always improve on the original predictor. It is thus important to understand the reasons for this success, and also for the occasional failures. It is widely believed that bagging is effective thanks to the variance reduction stemming from averaging predictors. However, seven years after its introduction, bagging is still not fully understood. This paper provides experimental evidence supporting the hypothesis that bagging stabilizes prediction by equalizing the influence of training examples. This effect is detailed in two different frameworks: estimation on the real line and regression. Bagging's improvements and deteriorations are explained by the goodness or badness of highly influential examples, in situations where the usual variance reduction argument is at best questionable. Finally, reasons for the equalization effect are advanced. They suggest that other resampling strategies, such as half-sampling, should provide qualitatively identical effects while being computationally less demanding than bootstrap sampling.

Keywords: bagging, influence, leverage, bias/variance
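
To make the procedure described in the abstract concrete, here is a minimal sketch of bagging for regression, assuming NumPy arrays and a scikit-learn-style base regressor; the function name `bagged_predict` and its `resampling` switch between bootstrap sampling and half-sampling are illustrative choices, not code from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_test, n_estimators=50,
                   resampling="bootstrap", random_state=None):
    # Average predictors trained on resampled versions of the training set.
    # resampling="bootstrap": draw n examples with replacement (classic bagging).
    # resampling="half": draw n/2 examples without replacement (half-sampling),
    # which the paper argues should have a qualitatively similar equalizing
    # effect at lower computational cost.
    rng = np.random.default_rng(random_state)
    n = len(y_train)
    preds = np.zeros((n_estimators, len(X_test)))
    for b in range(n_estimators):
        if resampling == "bootstrap":
            idx = rng.integers(0, n, size=n)       # sample with replacement
        else:
            idx = rng.permutation(n)[: n // 2]     # sample without replacement
        model = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        preds[b] = model.predict(X_test)
    return preds.mean(axis=0)  # the bagged estimate: average of the B predictors
```

Swapping `resampling="bootstrap"` for `"half"` trains each predictor on half as many points, which is the computational saving the abstract attributes to half-sampling.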


Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Yves Grandvalet
  1. Heudiasyc, UMR CNRS 6599, Université de Technologie de Compiègne, France
