Pattern Similarity Measures Applied to Mass Spectra

  • Conference paper
Progress in Industrial Mathematics: Success Stories

Part of the book series: SEMA SIMAI Springer Series (ICIAM2019SSSS, volume 5)

Abstract

Mass spectrometry is a core analytical chemistry technique for elucidating the structure and identity of compounds. Broadly, the technique involves the ionization of an analyte and analysis of the resulting mass spectrum, a representation of ion intensity as a function of mass-to-charge ratio. In this article, the notion of similarity as it applies to mass spectra is explored. In particular, several modes of approximating distances and similarities in patterns are touched upon: ℓ1 and ℓ2 distances, the Wasserstein metric (earth mover’s distance), and measures derived from cosine similarity. The manuscript concludes with a report on the performance of the similarity measures on a small test set of data, followed by a discussion of mass spectral library searching and prospects for quantifying uncertainty in compound identifications by leveraging mass spectral similarity.
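The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the function names, the two toy spectra, and the choice to normalize intensities to unit sum are assumptions made here for illustration only. Spectra are treated as unit-mass histograms indexed by integer mass-to-charge ratio, and the earth mover's distance uses the well-known one-dimensional shortcut (the sum of absolute differences between cumulative sums), which holds only when both spectra carry the same total intensity and the ground distance is the absolute difference in m/z.

```python
import math

def to_vector(peaks, max_mz):
    """Dense intensity list indexed by integer m/z from 1 to max_mz."""
    return [float(peaks.get(mz, 0.0)) for mz in range(1, max_mz + 1)]

def normalize(v):
    """Scale intensities to sum to 1 (an assumption of this sketch)."""
    total = sum(v)
    if total <= 0:
        raise ValueError("spectrum has no intensity")
    return [x / total for x in v]

def l1_distance(p, q):
    """Sum of absolute bin-wise intensity differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def l2_distance(p, q):
    """Euclidean distance between the intensity vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cosine_similarity(p, q):
    """Cosine of the angle between the intensity vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norms = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norms

def emd_1d(p, q):
    """1-D earth mover's distance for equal-mass histograms on unit-spaced bins."""
    carried, work = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a - b       # intensity that must still be moved to later bins
        work += abs(carried)   # moving it one bin costs one unit of distance
    return work

# Hypothetical toy spectra: identical except one peak shifted from m/z 58 to 59.
spectrum_a = {41: 30, 43: 100, 58: 70}
spectrum_b = {41: 30, 43: 100, 59: 70}
p = normalize(to_vector(spectrum_a, 60))
q = normalize(to_vector(spectrum_b, 60))

for name, fn in [("l1", l1_distance), ("l2", l2_distance),
                 ("cosine", cosine_similarity), ("emd", emd_1d)]:
    print(name, round(fn(p, q), 4))
```

In this toy case the ℓ1, ℓ2, and cosine measures see only that the two peaks do not coincide, so their values would be the same if the shifted peak were moved farther away, whereas the earth mover's distance grows with the size of the shift.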

Notes

  1. The specification of “unit-mass resolution” indicates that the mass-to-charge ratios of ions are always positive integer values. Electron ionization mass spectra at this resolution are commonly used in many industrial applications.

  2. The term “replicate spectra” is used here to indicate spectra of one compound sourced from two different commercial libraries. This differs from the usual convention, in which replicates are repeated measurements by a single individual or source.

  3. The term “non-replicate spectra” is used here to indicate a pair of spectra from two different compounds, each spectrum from a different library.

Acknowledgements

The first author would like to thank Prof. Peregrina Quintela Estévez (Universidade de Santiago de Compostela) for coordinating Industry Day at the International Congress on Industrial and Applied Mathematics 2019 meeting. The meeting generated meaningful relationships and discussions that will greatly benefit future work in this field. The authors also acknowledge Christopher Schanzle (National Institute of Standards and Technology) for an implementation of the earth mover’s distance, and Dr. Gary Mallard (National Institute of Standards and Technology) for his guidance in preparing this manuscript.

Author information

Corresponding author

Correspondence to Arun S. Moorthy.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Moorthy, A.S., Kearsley, A.J. (2021). Pattern Similarity Measures Applied to Mass Spectra. In: Cruz, M., Parés, C., Quintela, P. (eds) Progress in Industrial Mathematics: Success Stories. SEMA SIMAI Springer Series, vol 5. Springer, Cham. https://doi.org/10.1007/978-3-030-61844-5_4