Processing and Analysis of GC/LC-MS-Based Metabolomics Data

Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 708)

Abstract

Data processing is a crucial step in metabolomics studies: it affects the quality of the data output, the analyses that can be performed and the subsequent biological interpretation. This chapter provides an overview of the processing and analysis of GC-MS- and LC-MS-based metabolomics data. Data preprocessing steps are described, along with the software available for handling such complex datasets. Multivariate techniques for the subsequent analysis of metabolomics data, including principal components analysis (PCA) and partial least squares discriminant analysis (PLS-DA), are described with illustrations. Steps for the identification of potential biomarkers and the use of metabolite databases are also outlined.
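
As a rough illustration of the multivariate step mentioned above, the following is a minimal sketch of PCA applied to a small peak-intensity table. It is not the chapter's own workflow: the function name `pca_scores`, the simulated data, and the choice of unit-variance scaling are all assumptions made for the example, and a real study would start from an aligned feature table produced by preprocessing software.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA via SVD on a samples x features matrix.

    Assumption for this sketch: features are mean-centred and scaled
    to unit variance before decomposition.
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    scaled = centred / std
    # Singular value decomposition of the scaled data matrix
    U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # feature contributions
    explained = (S ** 2) / np.sum(S ** 2)             # variance fraction per component
    return scores, loadings, explained[:n_components]

# Hypothetical peak table: 10 samples (5 control, 5 treated) x 4 features
rng = np.random.default_rng(0)
control = rng.normal(100.0, 5.0, size=(5, 4))
treated = rng.normal(100.0, 5.0, size=(5, 4))
treated[:, 0] += 60.0  # simulate one feature elevated in the treated group
X = np.vstack([control, treated])

scores, loadings, explained = pca_scores(X)
print(scores.shape, loadings.shape)
print(explained)
```

The scores would normally be plotted (first component against second) to look for grouping or outliers, and the loadings inspected to see which features drive any separation. Other scaling choices, such as Pareto scaling, are also common and can change the result.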

Key words

GC-MS, LC-MS, metabolomics, metabolite, alignment, multivariate, PCA, PLS-DA

Notes

Acknowledgements

The authors would like to acknowledge Dr. Timothy Ebbels for valuable discussions during the preparation of this chapter. EW would like to acknowledge Waters Corporation for funding. Perrine Masson would like to acknowledge Servier Laboratories Ltd. for funding.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Biomolecular Medicine, Department of Surgery and Cancer, Faculty of Medicine, Imperial College, London, UK
