Encyclopedia of Database Systems

2009 Edition


  • Hwanjo Yu
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-39940-9_566



The bootstrap is a statistical method for estimating the performance (e.g., accuracy) of classification or regression methods. It is based on the statistical procedure of sampling with replacement. Unlike other estimation methods such as cross-validation, the bootstrap can select the same object or tuple for the training set more than once. That is, each time a tuple is selected, it is returned to the pool, so it is equally likely to be selected again and re-added to the training set.
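The sampling step in the definition above can be sketched in a few lines of Python. This is a minimal illustration, not the entry's own procedure: the function name `bootstrap_sample`, the choice to draw exactly n indices from a data set of n tuples, and the reporting of never-selected tuples as a held-out set are all assumptions made for the example.

```python
import random


def bootstrap_sample(n, rng):
    """Draw n indices from range(n) uniformly, with replacement.

    Returns (train, held_out). `train` may contain the same index
    several times; `held_out` lists the indices that were never drawn.
    Treating the never-drawn tuples as a held-out set is an assumption
    of this sketch.
    """
    # Each draw is independent, so an index already chosen is just as
    # likely to be chosen again as any other index.
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    held_out = [i for i in range(n) if i not in drawn]
    return train, held_out


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
train, held_out = bootstrap_sample(10, rng)
print("training indices:", train)
print("never selected:  ", held_out)
```

Because the draws are made with replacement, the training list always has n entries but usually fewer than n distinct tuples, which is the property that distinguishes the bootstrap from a cross-validation split.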

Historical Background

Bootstrap sampling was developed by Bradley Efron in 1979 and was mainly used for estimating statistical parameters such as the mean and standard errors [2]. A meta-classification method based on the bootstrap, called bootstrap aggregating (or bagging), was proposed by Leo Breiman in 1994 to improve classification by combining the classifications obtained from randomly generated training sets [1].


This section discusses a commonly...


Recommended Reading

  1. Breiman L. Bagging predictors. Machine Learning, 1996.
  2. Efron B. and Tibshirani R.J. An Introduction to the Bootstrap. CRC Press, Boca Raton, 1994.

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Hwanjo Yu (1)
  1. University of Iowa, Iowa City, USA