Bagging Classification Models with Reduced Bootstrap

  • Rafael Pino-Mejías
  • María-Dolores Cubiles-de-la-Vega
  • Manuel López-Coello
  • Esther-Lydia Silva-Ramírez
  • María-Dolores Jiménez-Gamero
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)


Bagging is an ensemble method proposed to improve the predictive performance of learning algorithms; it is especially effective when applied to unstable predictors. It aggregates a number of prediction models, each fitted to a bootstrap sample of the available training set. We introduce an alternative method for bagging classification models, motivated by the reduced bootstrap methodology, in which each generated bootstrap sample is forced to contain a number of distinct original observations between two values k1 and k2. Five choices of k1 and k2 are considered, and the five resulting models are empirically studied and compared with standard bagging on three real data sets, employing classification trees and neural networks as the base learners. The comparison reveals that this reduced bagging technique tends to diminish both the mean and the variance of the error rate.
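The abstract does not spell out how the constrained samples are drawn, so the following is only a minimal sketch of the idea: draw ordinary bootstrap samples and reject any whose number of distinct original observations falls outside [k1, k2], then aggregate base classifiers trained on the accepted samples by majority vote. The rejection-sampling scheme and the 1-nearest-neighbour base learner are illustrative assumptions, not the authors' implementation (the paper uses classification trees and neural networks).

```python
import random
from collections import Counter

def reduced_bootstrap_indices(n, k1, k2, rng):
    """Draw a bootstrap sample of size n from {0, ..., n-1}, redrawing
    until the number of DISTINCT indices lies in [k1, k2].
    (Simple rejection sampling -- an assumed scheme, not the paper's.)"""
    while True:
        idx = [rng.randrange(n) for _ in range(n)]
        if k1 <= len(set(idx)) <= k2:
            return idx

def fit_1nn(X, y):
    """Toy base learner: memorize the (feature, label) pairs."""
    return list(zip(X, y))

def predict_1nn(model, x):
    """Predict the label of the nearest stored 1-D feature."""
    return min(model, key=lambda p: abs(p[0] - x))[1]

def reduced_bagging_predict(X, y, x, B, k1, k2, seed=0):
    """Aggregate B base classifiers, each trained on a reduced
    bootstrap sample, by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    votes = []
    for _ in range(B):
        idx = reduced_bootstrap_indices(n, k1, k2, rng)
        model = fit_1nn([X[i] for i in idx], [y[i] for i in idx])
        votes.append(predict_1nn(model, x))
    return Counter(votes).most_common(1)[0][0]
```

Setting k1 = k2 = n would demand all-distinct samples (rarely accepted), while k1 = 1, k2 = n recovers ordinary bagging; the five (k1, k2) choices studied in the paper sit between these extremes.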


Keywords: Node, Bootstrap Sample, Classification Tree, Test Error Rate, Percent Error Rate
(These keywords were added by machine and not by the authors; they may be updated as the learning algorithm improves.)


References

  1. Breiman, L.: Bagging Predictors. Mach. Learn. 24, 123–140 (1996)
  2. Bühlmann, P., Yu, B.: Analyzing Bagging. Ann. Stat. 30(4), 927–961 (2002)
  3. Buja, A., Stuetzle, W.: The effect of bagging on variance, bias, and mean squared error. AT&T Labs-Research (2000) (preprint)
  4. Rao, C.R., Pathak, P.K., Koltchinskii, V.I.: Bootstrap by sequential resampling. J. Statist. Plan. Infer. 64, 257–281 (1997)
  5. Jiménez-Gamero, M.D., Muñoz-García, J., Pino-Mejías, R.: Reduced bootstrap for the median. Stat. Sinica (2004) (in press)
  6. Efron, B.: Bootstrap methods: another look at the jackknife. Ann. Stat. 7, 1–26 (1979)
  7. Muñoz-García, J., Pino-Mejías, R., Muñoz-Pichardo, J.M., Cubiles-de-la-Vega, M.D.: Identification of outlier bootstrap samples. J. Appl. Stat. 24(3), 333–342 (1997)
  8. Hall, P.: Antithetic resampling for the bootstrap. Biometrika, 713–724 (1989)
  9. Johns, M.V.: Importance sampling for bootstrap confidence intervals. J. Am. Stat. Assoc. 83, 709–714 (1988)
  10. Jiménez-Gamero, M.D., Muñoz-García, J., Muñoz-Reyes, A., Pino-Mejías, R.: On Efron's method II with identification of outlier bootstrap samples. Computation. Stat. 13, 301–318 (1998)
  11. Ihaka, R., Gentleman, R.: R: A Language for Data Analysis and Graphics. J. Comput. Graph. Stat. 5, 299–314 (1996)
  12. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984)
  13. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, Heidelberg (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Rafael Pino-Mejías (1, 2)
  • María-Dolores Cubiles-de-la-Vega (2)
  • Manuel López-Coello (3)
  • Esther-Lydia Silva-Ramírez (3)
  • María-Dolores Jiménez-Gamero (2)

  1. Centro Andaluz de Prospectiva, Sevilla, Spain
  2. Departamento de Estadística e Investigación Operativa, Facultad de Matemáticas, Universidad de Sevilla, Sevilla, Spain
  3. Departamento de Lenguajes y Sistemas Informáticos, E. Superior de Ingeniería, Universidad de Cádiz, Cádiz, Spain
