Computational Strategies for Biological Interpretation of Metabolomics Data

  • Jianguo Xia
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 965)


Biological interpretation of metabolomics data relies on two basic steps: metabolite identification and functional analysis. The two steps need to be applied in a coordinated manner for the data to be understood effectively. This chapter introduces the main computational concepts and workflows involved in this process. After a general overview of the field, three sections follow: the first introduces the main computational methods and bioinformatics tools for metabolite identification using spectra from common analytical platforms; the second covers the major bioinformatics approaches for functional enrichment analysis of metabolomics data; and the last discusses the three main workflows in current metabolomics studies, namely the chemometrics approach, the metabolic profiling approach and the more recent chemo-enrichment analysis approach. The chapter ends with a summary and future perspectives on computational metabolomics.


Metabolomics; Chemometrics; Metabolic profiling; Metabolite set enrichment analysis; Chemo-enrichment analysis



AMDIS: Automated mass spectral deconvolution and identification system
BATMAN: Bayesian automated metabolite analyzer for NMR
GC-MS: Gas chromatography mass spectrometry
CSF: Cerebral spinal fluid
GO: Gene ontology
GSEA: Gene set enrichment analysis
LC-MS: Liquid chromatography mass spectrometry
MSEA: Metabolite set enrichment analysis
NIST: National Institute of Standards and Technology
NMR: Nuclear magnetic resonance
PCA: Principal component analysis
PLS-DA: Partial least squares discriminant analysis
OPLS-DA: Orthogonal partial least squares discriminant analysis
ORA: Overrepresentation analysis
PCR: Polymerase chain reaction



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Institute of Parasitology, and Department of Animal Science, McGill University, Sainte Anne de Bellevue, Canada
  2. Department of Microbiology and Immunology, McGill University, Montreal, Canada
