On the Simultaneous Analysis of Clinical and Omics Data: A Comparison of Globalboosttest and Pre-validation Techniques

  • Conference paper
Statistical Models for Data Analysis

Abstract

In medical research, biostatisticians often face supervised learning problems that involve different kinds of predictors, e.g., classical clinical predictors and high-dimensional “omics” data. The question of the added predictive value of high-dimensional omics data, given that classical predictors are already available, has long been under-considered in the biostatistics and bioinformatics literature. The issue is characterized by a lack of guidelines and a large number of conceivable approaches. The present paper systematically compares two existing methods that address it. The globalboosttest procedure (Boulesteix & Hothorn. (2010). BMC Bioinformatics, 11, 78.) examines the additional predictive value of high-dimensional molecular data via boosting regression that includes a clinical offset, while the pre-validation method summarizes the omics data in the form of a new cross-validated predictor that is then assessed in a standard generalized linear model (Tibshirani & Efron. (2002). Statistical Applications in Genetics and Molecular Biology, 1, 1). Globalboosttest and pre-validation are introduced and discussed, then assessed in a simulation study with survival data, and finally applied to breast cancer microarray data for illustration. R code to reproduce our results and figures is available from http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/gbtpv/index.html.
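
To make the pre-validation idea from the abstract concrete, the following is a minimal Python sketch of how a cross-validated omics predictor could be built. It is not the authors' R code. It assumes a binary outcome for simplicity (the paper's simulation uses survival data), and the inner learner `fit_centroid_score`, the helper `score`, the function `prevalidate`, and the toy data are all hypothetical stand-ins chosen for illustration.

```python
import random


def fit_centroid_score(X, y):
    """Toy omics learner (an assumption, not the paper's model):
    per-feature difference of class means, centred at the midpoint."""
    n1 = sum(y)
    n0 = len(y) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("each training set needs both outcome classes")
    p = len(X[0])
    mean1 = [sum(x[j] for x, t in zip(X, y) if t == 1) / n1 for j in range(p)]
    mean0 = [sum(x[j] for x, t in zip(X, y) if t == 0) / n0 for j in range(p)]
    weights = [a - b for a, b in zip(mean1, mean0)]
    midpoints = [(a + b) / 2 for a, b in zip(mean1, mean0)]
    return weights, midpoints


def score(model, x):
    """Linear score of one observation under the toy model."""
    weights, midpoints = model
    return sum(w * (xj - m) for w, xj, m in zip(weights, x, midpoints))


def prevalidate(X, y, k=5, seed=0):
    """Return one score per observation, each computed by a model
    that was fitted without that observation's fold."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    z = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_centroid_score([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            z[i] = score(model, X[i])
    return z


# Toy data (hypothetical): 40 samples, 20 features, 3 of them informative.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(0.5 * t if j < 3 else 0.0, 1.0) for j in range(20)] for t in y]
z = prevalidate(X, y, k=5)
```

The resulting vector `z` plays the role of the single pre-validated omics predictor. In the approach described above it would then enter a standard generalized linear model together with the clinical predictors; that final step is left out of this sketch.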

References

  • Binder, H., & Schumacher, M. (2008). Adapting prediction error estimates for biased complexity selection in high-dimensional bootstrap samples. Statistical Applications in Genetics and Molecular Biology, 7(1), 12.

  • Boulesteix, A., Porzelius, C., & Daumer, M. (2008). Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value. Bioinformatics, 24(15), 1698–1706.

  • Boulesteix, A. L., & Hothorn, T. (2010). Testing the additional predictive value of high-dimensional molecular data. BMC Bioinformatics, 11, 78.

  • Boulesteix, A. L., & Sauerbrei, W. (2011). Added predictive value of high-throughput molecular data to clinical data and its validation. Briefings in Bioinformatics, 12(3), 215–229.

  • Chin, K., et al. (2006). Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell, 10(6), 529–541.

  • Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B, 34(2), 187–220.

  • Höfling, H., & Tibshirani, R. J. (2008). A study of pre-validation. The Annals of Applied Statistics, 2(2), 643–664.

  • Tibshirani, R. J., & Efron, B. (2002). Pre-validation and inference in microarrays. Statistical Applications in Genetics and Molecular Biology, 1, 1.

Acknowledgements

We thank Jutta Engel for helpful advice on the breast cancer data.

Author information

Corresponding author

Correspondence to Anne-Laure Boulesteix.

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Oelker, MR., Boulesteix, AL. (2013). On the Simultaneous Analysis of Clinical and Omics Data: A Comparison of Globalboosttest and Pre-validation Techniques. In: Giudici, P., Ingrassia, S., Vichi, M. (eds) Statistical Models for Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00032-9_30