Summary
Metabolomics, the large-scale analysis of metabolites, depends on bioinformatics tools for data analysis, visualization, and integration. This chapter outlines the basic composition of the chromatographically coupled mass spectrometry (MS) data sets used in metabolomics and details the steps needed to extract large-scale qualitative and quantitative information from them: noise filtering, peak picking and deconvolution, peak identification, peak alignment, and the creation of a final data matrix for statistical processing. Multivariate tools for comparative analysis are presented and illustrated using data for Medicago truncatula. Additional tools for visualizing and integrating metabolomics data within a biological context are discussed, and two tables list current metabolomics data processing and visualization software. Because metabolomics is rapidly maturing, a final section addresses the need for data standardization and current efforts toward it.
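The processing steps named in the summary can be illustrated with a minimal sketch. It is not the chapter's method or any particular software package's algorithm: the chromatograms are synthetic, the function names (`smooth`, `pick_peaks`, `align`, `pca_scores`) are hypothetical, and a moving average, a simple local-maximum rule, and retention-time binning stand in for the more sophisticated filtering, deconvolution, and alignment methods used in practice. Peak identification is omitted. Principal component analysis is used here as one example of a multivariate tool.

```python
import numpy as np

def smooth(signal, window=5):
    """Noise filtering: a plain moving average (an assumed, simple choice)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")

def pick_peaks(times, signal, threshold):
    """Peak picking: local maxima whose height exceeds the threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if (signal[i] > threshold and signal[i] > signal[i - 1]
                and signal[i] >= signal[i + 1]):
            peaks.append((times[i], signal[i]))
    return peaks

def align(peak_lists, tolerance):
    """Peak alignment: group peaks whose retention times agree within
    the tolerance, then build a samples x peaks data matrix."""
    centers = []
    for peaks in peak_lists:
        for rt, _ in peaks:
            if not any(abs(rt - c) <= tolerance for c in centers):
                centers.append(rt)
    centers.sort()
    matrix = np.zeros((len(peak_lists), len(centers)))
    for row, peaks in enumerate(peak_lists):
        for rt, height in peaks:
            col = int(np.argmin([abs(rt - c) for c in centers]))
            # Keep the tallest local maximum that falls in each group.
            matrix[row, col] = max(matrix[row, col], height)
    return np.array(centers), matrix

def pca_scores(matrix, n_components=2):
    """Multivariate comparison: PCA scores via SVD of the centered matrix."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic data (an assumption for illustration): four chromatograms with
# peaks near 3.0 and 7.0 min; the second peak is larger in "treated" samples.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 10.0, 1001)
shifts = [0.00, 0.03, -0.02, 0.04]          # small retention-time drift
second_peak_heights = [50.0, 50.0, 150.0, 150.0]  # two control, two treated

def gaussian(t, center, height, width=0.05):
    return height * np.exp(-((t - center) ** 2) / (2 * width ** 2))

peak_lists = []
for shift, h2 in zip(shifts, second_peak_heights):
    raw = (gaussian(times, 3.0 + shift, 100.0)
           + gaussian(times, 7.0 + shift, h2)
           + rng.normal(0.0, 1.0, times.size))
    peak_lists.append(pick_peaks(times, smooth(raw), threshold=20.0))

centers, matrix = align(peak_lists, tolerance=0.5)
scores = pca_scores(matrix)
print(centers)
print(matrix.round(1))
print(scores.round(1))
```

In this sketch each row of `matrix` is a sample and each column is an aligned peak, which is the shape a final data matrix needs for statistical processing. The first principal component should separate the two simulated groups, because the only systematic difference between them is the height of the second peak.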
Acknowledgments
The Sumner lab is supported by The National Science Foundation Plant Genome Research Program Award no. DBI-0109732, NSF 2010 MCB-0520283, NSF 2010 MCB-0520140, State of Oklahoma, and The Samuel Roberts Noble Foundation.
© 2005 Springer-Verlag Berlin Heidelberg
Cite this protocol
Sumner, L.W., Urbanczyk-Wochniak, E., Broeckling, C.D. (2005). Metabolomics Data Analysis, Visualization, and Integration. In: Edwards, D. (eds) Plant Bioinformatics. Methods in Molecular Biology™, vol 406. Humana Press. https://doi.org/10.1007/978-1-59745-535-0_20
DOI: https://doi.org/10.1007/978-1-59745-535-0_20
Publisher Name: Humana Press
Print ISBN: 978-1-58829-653-5
Online ISBN: 978-1-59745-535-0
eBook Packages: Springer Protocols