Metabolic Profiling, pp 277-298
Processing and Analysis of GC/LC-MS-Based Metabolomics Data
Abstract
Data processing is a crucial step in metabolomics studies, affecting data output quality, analysis potential and subsequent biological interpretation. This chapter provides an overview of the processing and analysis of GC-MS- and LC-MS-based metabolomics data. Data preprocessing steps are described, along with the different software available for dealing with such complex datasets. Multivariate techniques for the subsequent analysis of metabolomics data, including principal components analysis (PCA) and partial least squares discriminant analysis (PLS-DA), are described with illustrations. Steps for the identification of potential biomarkers and the use of metabolite databases are also outlined.
Key words
GC-MS LC-MS metabolomics metabolite alignment multivariate PCA PLS-DA
Acknowledgements
The authors would like to acknowledge Dr. Timothy Ebbels for valuable discussions during the preparation of this chapter. EW would like to acknowledge Waters Corporation for funding. Perrine Masson would like to acknowledge Servier Laboratories Ltd. for funding.