Omics–Bioinformatics in the Context of Clinical Data

Mayer, Gert; Heinze, Georg; Mischak, Harald; Hellemons, Merel E.; Heerspink, Hiddo J. Lambers; Bakker, Stephan J. L.; de Zeeuw, Dick; Haiduk, Martin; Rossing, Peter; Oberbauer, Rainer

doi:10.1007/978-1-61779-027-0_22


Part of the book series: Methods in Molecular Biology (MIMB, volume 719)


Abstract

The Omics revolution has given researchers tools and methodologies for the qualitative and quantitative assessment of a wide spectrum of molecular players, spanning from the genome to the metabolome level. As a consequence, explorative analysis (in contrast to purely hypothesis-driven research) has become feasible. However, numerous issues have to be considered to derive meaningful results from Omics, and bioinformatics has to respect these in data analysis and interpretation. Relevant aspects include sample type and quality, a concise definition of the (clinical) question, and the selection of samples, ideally from thoroughly defined sample and data repositories. Omics suffers from a principal shortcoming, namely an unbalanced sample-to-feature matrix, referred to as the “curse of dimensionality”, where a feature is a specific gene or protein among the many thousands assayed in parallel in an Omics experiment. This setting makes the identification of features relevant to the phenotype under analysis error-prone from a statistical perspective. It follows that sample size calculation is essential, both for screening studies and for the verification of results from Omics bioinformatics. Here we present key elements to be considered for embedding Omics bioinformatics in a quality-controlled workflow for Omics screening, feature identification, and validation. Relevant items include sample and clinical data management, minimum sample quality requirements, sample size estimates, and statistical procedures for computing the significance of findings from Omics bioinformatics in validation studies.
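
To make the link between the number of features and the required sample size concrete, the following is a minimal sketch and not the procedure used in the chapter. It assumes a two-sided, two-sample comparison per feature, a normal approximation, a standardized effect size, and a Bonferroni correction across all features; the function names `n_per_group` and `expected_false_positives` are hypothetical and chosen for illustration only.

```python
from math import ceil
from statistics import NormalDist


def expected_false_positives(n_features, alpha=0.05):
    """Expected number of null features flagged when each is tested at alpha."""
    return n_features * alpha


def n_per_group(effect_size, n_features, alpha=0.05, power=0.8):
    """Approximate samples per group needed to detect one feature.

    Assumptions (illustrative, not from the chapter):
    - two-sided, two-sample comparison with a normal approximation
    - effect_size is the standardized mean difference (Cohen's d)
    - family-wise error is controlled by Bonferroni: alpha / n_features
    """
    if effect_size <= 0 or n_features < 1:
        raise ValueError("effect_size must be > 0 and n_features >= 1")
    std_normal = NormalDist()
    alpha_per_test = alpha / n_features
    z_alpha = std_normal.inv_cdf(1 - alpha_per_test / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # Compare a single pre-specified marker with Omics-scale screens.
    for m in (1, 1_000, 20_000):
        print(m, expected_false_positives(m), n_per_group(1.0, m))
```

Under these assumptions, the required sample size grows only slowly with the number of features, whereas the expected number of false positives without any correction grows linearly. Less conservative corrections, such as false discovery rate control, would give different numbers.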



Acknowledgements

This work was supported by the European Union FP7 project “SysKid”, project number 241544.

Author information

Authors and Affiliations

Department of Internal Medicine IV (Nephrology and Hypertension), Medical University of Innsbruck, Innsbruck, Austria
Gert Mayer
Section of Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
Georg Heinze
mosaiques diagnostics GmbH, Hannover, Germany
Harald Mischak
Department of Nephrology, University Medical Center Groningen, Groningen, The Netherlands
Merel E. Hellemons, Hiddo J. Lambers Heerspink, Stephan J. L. Bakker & Dick de Zeeuw
emergentec biodevelopment GmbH, Vienna, Austria
Martin Haiduk
Steno Diabetes Center Denmark, Gentofte, Denmark
Peter Rossing
Medical University of Vienna and KH Elisabethinen Linz, Vienna, Austria
Rainer Oberbauer


Corresponding author

Correspondence to Gert Mayer.

Editor information

Editors and Affiliations

emergentec biodevelopment GmbH, Gersthofer Strasse 29-31, Vienna, 1180, Austria
Bernd Mayer


About this protocol

Cite this protocol

Mayer, G. et al. (2011). Omics–Bioinformatics in the Context of Clinical Data. In: Mayer, B. (eds) Bioinformatics for Omics Data. Methods in Molecular Biology, vol 719. Humana Press. https://doi.org/10.1007/978-1-61779-027-0_22


DOI: https://doi.org/10.1007/978-1-61779-027-0_22
Published: 29 January 2011
Publisher Name: Humana Press
Print ISBN: 978-1-61779-026-3
Online ISBN: 978-1-61779-027-0
eBook Packages: Springer Protocols
