Abstract
Mass spectrometry is a core analytical chemistry technique for elucidating the structure and identity of compounds. Broadly, the technique involves ionizing an analyte and analyzing the resulting mass spectrum, a representation of ion intensity as a function of mass-to-charge ratio. This article explores the notion of similarity as it applies to mass spectra. In particular, several ways of measuring distance and similarity between patterns are discussed: the ℓ₁ and ℓ₂ distances, the Wasserstein metric (earth mover’s distance), and measures derived from cosine similarity. The manuscript concludes with a report on the performance of these similarity measures on a small test set of data, followed by a discussion of mass spectral library searching and of prospects for quantifying uncertainty in compound identifications using mass spectral similarity.
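The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two toy spectra, and the choice to normalize each spectrum to unit total intensity are assumptions made here for illustration. The chapter's exact definitions (for example, any intensity weighting in the cosine-derived measures) are not reproduced. Spectra are taken to be at unit-mass resolution, so every mass-to-charge value is a positive integer and the earth mover’s distance reduces to a sum of absolute differences between cumulative intensities.

```python
import math

def normalize(spectrum, max_mz):
    """Turn a {m/z: intensity} dict into a dense vector summing to 1.

    Assumption: unit-mass resolution, so m/z values are integers 1..max_mz.
    """
    total = sum(spectrum.values())
    return [spectrum.get(mz, 0.0) / total for mz in range(1, max_mz + 1)]

def l1_distance(p, q):
    """Sum of absolute intensity differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def l2_distance(p, q):
    """Euclidean distance between intensity vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def wasserstein_1d(p, q):
    """Earth mover's distance for two unit-total histograms on the same
    integer grid: the sum of absolute differences of the running totals."""
    running, cost = 0.0, 0.0
    for a, b in zip(p, q):
        running += a - b
        cost += abs(running)
    return cost

def cosine_similarity(p, q):
    """Cosine of the angle between two intensity vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Hypothetical toy spectra: identical except one peak shifted by one m/z unit.
spectrum_a = {41: 30.0, 43: 100.0, 58: 70.0}
spectrum_b = {41: 30.0, 43: 100.0, 59: 70.0}

p = normalize(spectrum_a, max_mz=60)
q = normalize(spectrum_b, max_mz=60)

print("l1         :", l1_distance(p, q))
print("l2         :", l2_distance(p, q))
print("wasserstein:", wasserstein_1d(p, q))
print("cosine     :", cosine_similarity(p, q))
```

In this toy case the ℓ₁, ℓ₂, and cosine measures treat the shifted peak as a complete mismatch at two positions, whereas the earth mover’s distance charges only for moving that intensity by one mass unit.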
Notes
1. The specification of “unit-mass resolution” indicates that the mass-to-charge ratios of ions are always positive integers. Electron ionization mass spectra at this resolution are commonly used in many industrial applications.
2. The term “replicate spectra” is used here to indicate spectra of one compound sourced from two different commercial libraries, which differs from the usual convention of a repeated measurement by a single individual or source.
3. The term “non-replicate spectra” is used here to indicate a pair of spectra from two different compounds, each spectrum from a different library.
Acknowledgements
The first author would like to thank Prof. Peregrina Quintela Estévez (Universidade de Santiago de Compostela) for coordinating Industry Day at the International Congress on Industrial and Applied Mathematics 2019 meeting. The meeting generated meaningful relationships and discussions that will greatly benefit future work in this field. The authors also acknowledge Christopher Schanzle (National Institute of Standards and Technology) for an implementation of the earth mover’s distance, and Dr. Gary Mallard (National Institute of Standards and Technology) for his guidance in preparing this manuscript.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Moorthy, A.S., Kearsley, A.J. (2021). Pattern Similarity Measures Applied to Mass Spectra. In: Cruz, M., Parés, C., Quintela, P. (eds) Progress in Industrial Mathematics: Success Stories. SEMA SIMAI Springer Series, vol 5. Springer, Cham. https://doi.org/10.1007/978-3-030-61844-5_4
DOI: https://doi.org/10.1007/978-3-030-61844-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61843-8
Online ISBN: 978-3-030-61844-5
eBook Packages: Mathematics and Statistics (R0)