Resampling Strategies for Model Assessment and Selection

  • Richard Simon


The advent of DNA microarrays and proteomics technology has stimulated the development and use of classification algorithms for biomedical studies. In oncology, for example, a common application is predicting response to treatment based on expression profiling of tumor tissue. Such a classifier could be used as an aid in treatment selection for future patients based on the expression profiles of their tumors. In developing such a classifier, it is important to estimate the predictive accuracy that can be expected for future application of the classifier.


Keywords: Prediction Error; Linear Discriminant Analysis; Classifier Development; Resampling Strategy; Diagonal Linear Discriminant Analysis





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Richard Simon (1)
  1. National Cancer Institute, Rockville, USA
