Regular expressions of MS/MS spectra for partial annotation of metabolite features
Partial annotation and characterization of metabolite structures from tandem mass spectrometry (MS/MS) spectra remain technical bottlenecks in metabolomics. Novel approaches are needed to evaluate spectral similarities among structurally related compounds and to describe fragmentation motifs commonly observed in MS/MS spectra.
A regular expression of MS/MS data was developed to search for structurally similar metabolites and to describe spectral motifs for partial annotation and characterization of metabolite structures.
An MS/MS string was first defined as a text representation of an MS/MS spectrum; a regular expression of MS/MS strings involving metacharacters, anchors, and quantifiers was then introduced. It was also demonstrated that a spectral motif, that is, a common fragmentation pattern observed among structurally related metabolites, can be described by a regular expression.
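The idea of an MS/MS string matched by a regular expression can be illustrated with a minimal sketch. The encoding used here (nominal m/z values of the retained peaks, ascending and space-separated), the function name `to_msms_string`, the intensity cutoff, and the example peaks and motif are all assumptions made for illustration; the abstract does not specify the paper's actual string format or syntax.

```python
import re

def to_msms_string(peaks, min_rel_intensity=0.05):
    """Encode a spectrum as text (hypothetical format, not the paper's).

    peaks: list of (m/z, intensity) pairs.
    Peaks below min_rel_intensity of the base peak are dropped; the rest are
    rounded to nominal m/z, sorted ascending, and joined with single spaces.
    """
    base = max(intensity for _, intensity in peaks)
    kept = sorted(round(mz) for mz, intensity in peaks
                  if intensity >= min_rel_intensity * base)
    return " ".join(str(mz) for mz in kept)

# Illustrative peak list (invented values, not taken from any database).
peaks = [(85.03, 120.0), (127.04, 300.0), (145.05, 999.0),
         (163.06, 50.0), (200.10, 10.0)]
msms_string = to_msms_string(peaks)

# A motif using anchors (^ and $), a metacharacter (\d), and quantifiers
# (*, +, {0,2}): any leading peaks, then 127, then at most two intervening
# peaks, then 163 as the last peak of the string.
motif = re.compile(r"^(?:\d+ )*127 (?:\d+ ){0,2}163$")

matches = motif.search(msms_string) is not None
```

Because every `\d+` in the pattern must be followed by a space, the expression consumes whole peak tokens, so `127` cannot match inside a longer value such as `1127`.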
The regular expression was applied to a search for similar MS/MS spectra. Analysis of MassBank data with fragment assignment information (fragment ion and neutral loss matrix, http://metabolomics.jp/wiki/Index:MassBank) suggested that the regular expression of MS/MS spectra can detect spectral similarities among structurally related metabolites. Analysis of MS/MS spectral libraries of Arabidopsis and rice revealed that the metabolite features can be partially annotated or characterized by the spectral motifs and can be assigned the corresponding ontology codes produced by Chemical Entities of Biological Interest (ChEBI).
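Searching a spectral library with motifs and attaching annotation labels to the matching features can be sketched as follows. This is only an illustration under stated assumptions: the library entries, the space-separated nominal m/z encoding, the motif patterns, and the labels `motif_A` and `motif_B` are hypothetical placeholders, not data from MassBank and not real ChEBI codes.

```python
import re

# Hypothetical library: feature ID -> MS/MS string (invented values).
library = {
    "feature_001": "85 127 145 163",
    "feature_002": "91 119 147 165",
    "feature_003": "127 145 163 325",
}

# Hypothetical motifs: placeholder label -> compiled regular expression.
motifs = {
    # Consecutive peaks 127, 145, 163 anywhere in the string.
    "motif_A": re.compile(r"(?:^| )127 145 163(?: |$)"),
    # A peak in the range 90-99, followed later by 165 as the final peak.
    "motif_B": re.compile(r"(?:^| )9[0-9] (?:\d+ )*165$"),
}

def annotate(library, motifs):
    """Return feature ID -> sorted list of labels whose motif matches."""
    return {
        feature_id: sorted(label for label, pattern in motifs.items()
                           if pattern.search(msms_string))
        for feature_id, msms_string in library.items()
    }

annotations = annotate(library, motifs)
```

A feature that matches no motif receives an empty list, which keeps unannotated features visible in the result rather than silently dropping them.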
The MS/MS spectral motifs represent a method for partial annotation or characterization of metabolite features. A regular expression of MS/MS data holds promise for further enrichment of metabolite annotations and for easy sharing of ambiguous annotation data among metabolomic studies.
Keywords: MS/MS spectrum; Regular expression; Small molecule identification; Mass spectral motif; Fragmentation
I am highly grateful to Prof. Takaaki Nishioka (Nara Institute of Science and Technology), Prof. Masanori Arita (National Institute of Genetics), and Mr. Yuya Ojima (MassBank) for providing the MS/MS spectra dataset from Metabolome.jp (http://metabolomics.jp/wiki/Index:MassBank). I also thank Dr. Yuji Sawada, Dr. Yutaka Yamada (RIKEN CSRS), Dr. Nozomu Sakurai, and Dr. Nayumi Akimoto (Kazusa DNA Research Institute) for their helpful comments on this manuscript and for their databasing work.
This work was partially supported by the JST, Strategic International Collaborative Research Program, SICORP for JP-US Metabolomics, and a Grant-in-Aid for Scientific Research (B) No. 25820400.
Compliance with ethical standards
Conflict of interest
The authors declared that they have no conflict of interest in the submission of this manuscript.
All institutional and national guidelines for the care and use of laboratory animals were followed.