Analyzing the Metabolome

Clinical Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1168))

Abstract

Metabolites, the chemical entities that are transformed during metabolism, provide a functional readout of cellular biochemistry that offers the best prediction of the phenotype and the nature of a disease. Mass spectrometry now allows thousands of metabolites to be quantitated. The targeted or untargeted data from metabolic profiling can be combined with either supervised or unsupervised approaches to improve interpretation. These sophisticated statistical techniques are computationally intensive. This chapter reviews techniques applicable to metabolomics approaches to disease.

Abbreviations

AUC: Area under curve
LC: Liquid chromatography
MS: Mass spectrometry
NMR: Nuclear magnetic resonance
ROC: Receiver operating characteristic

Author information

Corresponding author

Correspondence to Francis G. Bowling.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Bowling, F.G., Thomas, M. (2014). Analyzing the Metabolome. In: Trent, R. (ed.) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_3

  • DOI: https://doi.org/10.1007/978-1-4939-0847-9_3

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0846-2

  • Online ISBN: 978-1-4939-0847-9

  • eBook Packages: Springer Protocols
