Metabolomics, Volume 11, Issue 1, pp 98–110

Competitive fragmentation modeling of ESI-MS/MS spectra for putative metabolite identification

Original Article

DOI: 10.1007/s11306-014-0676-4

Cite this article as:
Allen, F., Greiner, R. & Wishart, D. Metabolomics (2015) 11: 98. doi:10.1007/s11306-014-0676-4


Electrospray tandem mass spectrometry (ESI-MS/MS) is commonly used in high-throughput metabolomics. One of the key obstacles to the effective use of this technology is the difficulty of interpreting measured spectra to identify metabolites accurately and efficiently. Traditional methods for automated metabolite identification compare the target MS or MS/MS spectrum to the spectra in a reference database, ranking candidates by the closeness of the match. However, the limited coverage of available databases has led to interest in computational methods for predicting reference MS/MS spectra from chemical structures. This work proposes a probabilistic generative model for the MS/MS fragmentation process, which we call competitive fragmentation modeling (CFM), and a machine learning approach for learning the parameters of this model from MS/MS data. We show that CFM can be used both in an MS/MS spectrum prediction task (i.e., predicting the mass spectrum from a chemical structure) and in a putative metabolite identification task (i.e., ranking possible structures for a target MS/MS spectrum). In the MS/MS spectrum prediction task, CFM shows significantly improved performance compared to a full enumeration of all peaks corresponding to substructures of the molecule. In the metabolite identification task, CFM obtains substantially better rankings for the correct candidate than existing methods (MetFrag and FingerID) on tripeptide and metabolite data, when querying PubChem or KEGG for candidate structures of similar mass.
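The identification task described above, ranking candidate structures for a target MS/MS spectrum, can be illustrated with a minimal sketch. It assumes each candidate already has a predicted spectrum given as a list of (m/z, intensity) pairs, and it scores candidates by cosine similarity over binned peaks. The cosine score, the bin width, the function names, and the toy peak lists are all assumptions made for illustration; the abstract does not specify the scoring function CFM uses, and this is not the authors' implementation.

```python
from collections import defaultdict
from math import sqrt


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(target_peaks, candidate_spectra, bin_width=0.01):
    """Return (candidate, score) pairs sorted from best to worst match.

    candidate_spectra maps a candidate identifier to its predicted peak list.
    Cosine similarity is a stand-in score, not the scoring used in the paper.
    """
    target = bin_spectrum(target_peaks, bin_width)
    scores = {
        name: cosine_similarity(target, bin_spectrum(peaks, bin_width))
        for name, peaks in candidate_spectra.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: hypothetical peak lists, not taken from the paper.
target_spectrum = [(85.03, 100.0), (130.05, 40.0)]
predicted = {
    "candidate_A": [(85.03, 100.0), (130.05, 40.0)],
    "candidate_B": [(60.02, 100.0), (130.05, 10.0)],
}
ranking = rank_candidates(target_spectrum, predicted)
```

In this toy case `candidate_A` ranks first because its predicted peaks coincide with the target peaks, while `candidate_B` shares only one low-intensity peak. In practice the candidate list would come from a mass-based query of a structure database such as PubChem or KEGG, as the abstract describes.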


Keywords

Tandem mass spectrometry; MS/MS; Metabolite identification; Machine learning

Supplementary material

Supplementary material 1: 11306_2014_676_MOESM1_ESM.pdf (PDF, 446 KB)
Supplementary material 2: 11306_2014_676_MOESM2_ESM.txt (TXT, 120 KB)
Supplementary material 3: 11306_2014_676_MOESM3_ESM.txt (TXT, 76 KB)
Supplementary material 4: 11306_2014_676_MOESM4_ESM.txt (TXT, 12 KB)

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Computing Science, University of Alberta, Edmonton, Canada
