Abstract
The identification of compounds from mass spectrometry (MS) data is still seen as a major bottleneck in the interpretation of MS data. This is particularly true for small compounds such as metabolites, where little progress had been made until recently. Here we review the available approaches to the annotation and identification of chemical compounds based on electrospray ionization mass spectrometry (ESI-MS) data. The methods are not limited to metabolomics applications but are applicable to any small compounds amenable to MS analysis. Starting with the definition of identification, we focus on the analysis of tandem mass spectra and MSn spectra, which can provide a wealth of structural information. Searching libraries of reference spectra provides the most reliable source of identification, especially if the spectra were measured on comparable instruments. We review several choices of distance function for comparing spectra. Identification without reference spectra is even more challenging, because it requires approaches that interpret tandem mass spectra with regard to the molecular structure. Both commercial and free tools are capable of mining general-purpose compound libraries and identifying candidate compounds. The holy grail of computational mass spectrometry is the de novo deduction of structure hypotheses for compounds, where method development has only just begun. In a case study, we apply several of the available methods to three compounds, kaempferol, reserpine, and verapamil, and investigate whether this results in reliable identifications.
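As an illustration of what a distance function for spectral library search can look like, the following minimal sketch computes a cosine (normalized dot-product) score between two peak lists and turns it into a distance. It is not taken from the article and is only one of many possible choices: the function names, the greedy peak pairing, and the m/z tolerance of 0.01 Da are assumptions made for this example.

```python
import math


def cosine_similarity(query, reference, tol=0.01):
    """Cosine score between two peak lists of (m/z, intensity) pairs.

    Peaks are paired greedily, largest intensity product first, whenever
    their m/z values differ by at most `tol` (an assumed tolerance in Da).
    Each peak is used in at most one pair.
    """
    candidates = [
        (q_int * r_int, i, j)
        for i, (q_mz, q_int) in enumerate(query)
        for j, (r_mz, r_int) in enumerate(reference)
        if abs(q_mz - r_mz) <= tol
    ]
    candidates.sort(reverse=True)

    used_query, used_reference = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i in used_query or j in used_reference:
            continue
        used_query.add(i)
        used_reference.add(j)
        dot += product

    norm_q = math.sqrt(sum(inten ** 2 for _, inten in query))
    norm_r = math.sqrt(sum(inten ** 2 for _, inten in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0  # an empty or all-zero spectrum matches nothing
    return dot / (norm_q * norm_r)


def spectral_distance(query, reference, tol=0.01):
    """Distance in [0, 1]: 0 for identical spectra, 1 for no shared peaks."""
    return 1.0 - cosine_similarity(query, reference, tol)
```

In a library search, a query spectrum would be scored against every reference spectrum and the candidates ranked by increasing distance. Practical implementations often also weight intensities or m/z values before scoring, which this sketch omits.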
Electronic supplementary material
Below is the link to the electronic supplementary material.
Online Resource 1
Molecular structures and details used for the identification case studies (PDF 890 kb)
Online Resource 2
Computational mass spectrometry for metabolomics: focus on the identification of metabolites and small molecules (TXT 11.2 kb)
Online Resource 3
Results of the FiD software (PDF 482 kb)
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Neumann, S., Böcker, S. Computational mass spectrometry for metabolomics: Identification of metabolites and small molecules. Anal Bioanal Chem 398, 2779–2788 (2010). https://doi.org/10.1007/s00216-010-4142-5
DOI: https://doi.org/10.1007/s00216-010-4142-5