Abstract
The Omics revolution has provided researchers with tools and methodologies for the qualitative and quantitative assessment of a wide spectrum of molecular players, spanning from the genome to the metabolome level. As a consequence, explorative analysis (in contrast to purely hypothesis-driven research procedures) has become applicable. However, numerous issues have to be considered to derive meaningful results from Omics, and bioinformatics has to respect these in data analysis and interpretation. Aspects include sample type and quality, a concise definition of the (clinical) question, and the selection of samples, ideally coming from thoroughly defined sample and data repositories. Omics suffers from a principal shortcoming, namely an unbalanced sample-to-feature matrix, denoted the “curse of dimensionality”, where a feature refers to a specific gene or protein among the many thousands assayed in parallel in an Omics experiment. This setting makes the identification of features relevant to a phenotype under analysis error-prone from a statistical perspective. Sample size calculation is therefore essential, both for screening studies and for the verification of results from Omics bioinformatics. Here we present key elements to be considered for embedding Omics bioinformatics in a quality-controlled workflow for Omics screening, feature identification, and validation. Relevant items include sample and clinical data management, minimum sample quality requirements, sample size estimates, and statistical procedures for computing the significance of findings from Omics bioinformatics in validation studies.
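To illustrate why the number of features drives sample size requirements, the following minimal sketch estimates the per-group sample size for a two-group comparison when the significance level is Bonferroni-adjusted for the number of features tested. It is an illustration under stated assumptions, not the procedure used in the chapter: it assumes a two-sided test with a normal approximation, equal group sizes, and a standardized effect size, and the function name `samples_per_group` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, n_features, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison (normal approximation, equal group sizes).

    effect_size: standardized mean difference (assumed, e.g. Cohen's d)
    n_features:  number of features tested in parallel; the significance
                 level is divided by this number (Bonferroni correction)
    """
    adjusted_alpha = alpha / n_features
    standard_normal = NormalDist()
    # Critical value for the adjusted two-sided significance level
    z_alpha = standard_normal.inv_cdf(1 - adjusted_alpha / 2)
    # Quantile corresponding to the desired power
    z_beta = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# The required sample size grows as more features are assayed in parallel.
for n_features in (1, 100, 10_000):
    print(n_features, samples_per_group(effect_size=1.0, n_features=n_features))
```

Bonferroni correction is conservative. Procedures that control the false discovery rate generally require fewer samples, so these figures are best read as a rough upper orientation rather than a design recommendation.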
Acknowledgements
This work was supported by the European Union FP7 project “SysKid”, project number 241544.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Mayer, G. et al. (2011). Omics–Bioinformatics in the Context of Clinical Data. In: Mayer, B. (eds) Bioinformatics for Omics Data. Methods in Molecular Biology, vol 719. Humana Press. https://doi.org/10.1007/978-1-61779-027-0_22
DOI: https://doi.org/10.1007/978-1-61779-027-0_22
Publisher Name: Humana Press
Print ISBN: 978-1-61779-026-3
Online ISBN: 978-1-61779-027-0
eBook Packages: Springer Protocols