
Regular expressions of MS/MS spectra for partial annotation of metabolite features

Original Article

DOI: 10.1007/s11306-016-1052-3

Cite this article as:
Matsuda, F. Metabolomics (2016) 12: 113. doi:10.1007/s11306-016-1052-3



Partial annotation and characterization of metabolite structures on the basis of data from tandem mass spectrometry (MS/MS) spectra are technical bottlenecks in metabolomics. Novel approaches should be explored for evaluation of spectral similarities among structurally related compounds as well as for description of fragmentation motifs commonly observed in MS/MS spectra.


A regular expression of MS/MS data was developed to search for structurally similar metabolites and to describe spectral motifs for partial annotation and characterization of metabolite structures.


After definition of an MS/MS string as a text representation of an MS/MS spectrum, a regular expression of MS/MS strings involving meta characters, anchors, and quantifiers was introduced. Here it was also demonstrated that spectral motifs can be described by a regular expression to define a common fragmentation pattern observed among structurally related metabolites.
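The idea can be illustrated with a minimal sketch. The encoding below is an assumption made for illustration only and is not the MS/MS string format defined in the article: each spectrum becomes a space-separated list of nominal m/z values in ascending order, and a motif is an ordinary regular expression over that text. The function name `to_msms_string`, the intensity threshold, the peak values, and the motif itself are all hypothetical.

```python
import re

def to_msms_string(peaks, min_rel_intensity=5.0):
    """Encode a spectrum as space-separated nominal m/z tokens (hypothetical format).

    peaks: list of (m/z, intensity) pairs.
    Peaks below min_rel_intensity (% of the base peak) are dropped.
    """
    base = max(intensity for _, intensity in peaks)
    kept = {
        round(mz)
        for mz, intensity in peaks
        if 100.0 * intensity / base >= min_rel_intensity
    }
    return " ".join(str(mz) for mz in sorted(kept))

# Hypothetical motif: a fragment at m/z 153 somewhere in the spectrum,
# any number of intermediate peaks (quantifier), and m/z 303 as the
# last token (end anchor).
MOTIF = re.compile(r"(?:^| )153(?: \d+)* 303$")

# Made-up peak lists, not taken from MassBank or the article.
spectrum_a = [(85.03, 12.0), (153.02, 100.0), (229.05, 3.0), (303.05, 40.0)]
spectrum_b = [(153.02, 100.0), (287.06, 50.0)]

string_a = to_msms_string(spectrum_a)
string_b = to_msms_string(spectrum_b)

hit_a = MOTIF.search(string_a) is not None
hit_b = MOTIF.search(string_b) is not None
```

In this sketch the first spectrum matches the motif and the second does not, because it has no peak at m/z 303. Matching on whole tokens (a leading space or start anchor, a trailing space or end anchor) keeps a value such as 1153 from being mistaken for 153.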


The regular expression was applied to a search for similar MS/MS spectra. Analysis of MassBank data with fragment assignment information (fragment ion and neutral loss matrix) suggested that the regular expression of MS/MS spectra can detect spectral similarities among structurally related metabolites. Analysis of MS/MS spectral libraries of Arabidopsis and rice revealed that the metabolite features can be partially annotated or characterized by the spectral motifs and can be assigned the corresponding ontology codes produced by Chemical Entities of Biological Interest (ChEBI).


The MS/MS spectral motifs represent a method for partial annotation or characterization of metabolite features. A regular expression of MS/MS data holds promise for further enrichment of metabolite annotations and for easy sharing of ambiguous annotation data among metabolomic studies.


MS/MS spectrum; Regular expression; Small molecule identification; Mass spectral motif; Fragmentation

Supplementary material

11306_2016_1052_MOESM1_ESM.xlsx: Supplementary material 1 (XLSX 701 kb)
11306_2016_1052_MOESM2_ESM.doc: Supplementary material 2 (DOC 737 kb)
Supplementary Data 1: The tandem mass spectrometry (MS/MS) string datasets of MassBank data (ZIP 1687 kb)

Funding information

Funder name: JST, Strategic International Collaborative Research Program, SICORP for JP-US Metabolomics
Funder name: Grant in Aid for Scientific Research (B); grant number: 25820400

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

1. Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Japan
2. RIKEN Center for Sustainable Resource Science, Yokohama, Japan
