Bagging Classification Models with Reduced Bootstrap
Bagging is an ensemble method proposed to improve the predictive performance of learning algorithms, and it is especially effective when applied to unstable predictors. It aggregates a number of prediction models, each generated from a bootstrap sample of the available training set. We introduce an alternative method for bagging classification models, motivated by the reduced bootstrap methodology, in which each generated bootstrap sample is forced to contain a number of distinct original observations between two values k_1 and k_2. Five choices for k_1 and k_2 are considered, and the five resulting models are studied empirically and compared with bagging on three real data sets, using classification trees and neural networks as the base learners. The comparison suggests that this reduced bagging technique tends to lower both the mean and the variance of the error rate.
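The constraint on distinct observations can be illustrated with a minimal sketch. It assumes that admissible samples are obtained by simple rejection (redrawing until the count of distinct indices falls in the closed interval [k1, k2]); the abstract does not state how the samples are actually generated or whether the bounds are inclusive. The names `reduced_bootstrap_indices`, `majority_vote`, and `max_tries` are hypothetical and chosen only for this illustration.

```python
import random
from collections import Counter


def reduced_bootstrap_indices(n, k1, k2, rng, max_tries=10_000):
    """Draw n indices with replacement from range(n), accepting the draw
    only if its number of distinct indices lies in [k1, k2].

    Rejection sampling is an assumption of this sketch, not necessarily
    the procedure used in the paper.
    """
    if not (1 <= k1 <= k2 <= n):
        raise ValueError("need 1 <= k1 <= k2 <= n")
    for _ in range(max_tries):
        sample = [rng.randrange(n) for _ in range(n)]
        if k1 <= len(set(sample)) <= k2:
            return sample
    raise RuntimeError("no admissible sample found; widen [k1, k2]")


def majority_vote(predictions):
    """Aggregate the class labels predicted by the ensemble members."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)
    idx = reduced_bootstrap_indices(n=50, k1=30, k2=34, rng=rng)
    print(len(idx), len(set(idx)))
```

An ordinary bootstrap sample of size n contains, on average, roughly 0.632n distinct observations, so bounds placed near that value are accepted quickly, while narrow or extreme bounds make rejection slow. In a bagging loop, each base learner would be trained on the rows selected by one such index list and the ensemble prediction obtained with `majority_vote`.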
Keywords: Hide Node, Bootstrap Sample, Classification Tree, Test Error Rate, Percent Error Rate