Similarity of High-Resolution Tandem Mass Spectrometry Spectra of Structurally Related Micropollutants and Transformation Products

  • Jennifer E. Schollée
  • Emma L. Schymanski
  • Michael A. Stravs
  • Rebekka Gulde
  • Nikolaos S. Thomaidis
  • Juliane Hollender
Research Article


High-resolution tandem mass spectrometry (HRMS2) with electrospray ionization is frequently applied to study polar organic molecules such as micropollutants. Fragmentation provides structural information to confirm structures of known compounds or propose structures of unknown compounds. Similarity of HRMS2 spectra between structurally related compounds has been suggested to facilitate identification of unknown compounds. To test this hypothesis, the similarity of reference standard HRMS2 spectra was calculated for 243 pairs of micropollutants and their structurally related transformation products (TPs); for comparison, spectral similarity was also calculated for 219 pairs of unrelated compounds. Spectra were measured on Orbitrap and QTOF mass spectrometers, and similarity was calculated with the dot product. The influence of different factors on spectral similarity [e.g., normalized collision energy (NCE), merging fragments from all NCEs, and shifting fragments by the mass difference of the pair] was considered. Spectral similarity increased at higher NCEs, and the highest similarity scores for related pairs were obtained with merged spectra that included both measured and shifted fragments. Removal of the monoisotopic peak was critical to reduce false positives. At a spectral similarity score threshold of 0.52, 40% of related pairs and 0% of unrelated pairs scored above the threshold. Structural similarity was estimated with the Tanimoto coefficient, and pairs with higher structural similarity generally had higher spectral similarity. Pairs in which one or both compounds contained heteroatoms such as sulfur often produced dissimilar spectra. This work demonstrates that HRMS2 spectral similarity may indicate structural similarity and that spectral similarity can be used in the future to screen complex samples for related compounds such as micropollutants and TPs, assisting in the prioritization of non-target compounds.
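To make the scoring described in the abstract concrete, the following is a minimal Python sketch of a dot product (cosine) score between two fragment lists, with optional matching of fragments shifted by the mass difference of the pair. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.005 m/z tolerance, the unweighted intensities, the greedy peak matching, and the example peak lists are all hypothetical, and the "monoisotopic peak" is interpreted here as the precursor ion peak.

```python
from math import sqrt


def drop_precursor(spectrum, precursor_mz, tol=0.005):
    """Return the peak list without peaks within `tol` of the precursor m/z.

    Assumption: the peak removed in the study corresponds to the precursor ion.
    """
    return [(mz, inten) for mz, inten in spectrum if abs(mz - precursor_mz) > tol]


def dot_product_similarity(spec_a, spec_b, shift=0.0, tol=0.005):
    """Cosine score between two peak lists of (m/z, intensity) tuples.

    A peak in spec_a may pair with a peak in spec_b at the same m/z or,
    when `shift` is nonzero, at m/z + shift. Each peak is used at most once.
    """
    if not spec_a or not spec_b:
        return 0.0
    offsets = (0.0, shift) if shift else (0.0,)
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if any(abs(mz_a + off - mz_b) <= tol for off in offsets):
                candidates.append((int_a * int_b, i, j))
    # Greedy assignment: accept the largest intensity products first.
    candidates.sort(reverse=True)
    used_a, used_b, matched = set(), set(), 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matched += product
    norm_a = sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return matched / (norm_a * norm_b)


# Hypothetical parent compound and a TP that is 15.995 heavier (e.g., +O).
parent_mz, tp_mz = 200.0, 215.995
parent = drop_precursor([(100.0, 50.0), (150.0, 100.0), (200.0, 20.0)], parent_mz)
tp = drop_precursor([(100.0, 50.0), (165.995, 100.0), (215.995, 20.0)], tp_mz)

print(round(dot_product_similarity(parent, tp), 3))                           # 0.2
print(round(dot_product_similarity(parent, tp, shift=tp_mz - parent_mz), 3))  # 1.0
```

In this toy pair only one fragment is shared directly, so the plain score is low. Allowing fragments to match after shifting by the precursor mass difference recovers the fragment that carries the modification, which mirrors the abstract's observation that including shifted fragments raised scores for related pairs. A pair would then be flagged as potentially related if its score exceeds a chosen threshold such as the 0.52 reported above.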


Keywords

High-resolution tandem mass spectrometry; Micropollutants; Transformation products; Non-target screening; Spectral similarity



Acknowledgments

Birgit Beck, Heinz Singer, and many members of the Department of Environmental Chemistry at Eawag are gratefully acknowledged for the measurement of the standards for MassBank. The authors additionally thank Nikiforos Alygizakis from the University of Athens for the measurement of the QTOFMS spectra. Uwe Schmitt (ETH Zurich) and Leon Bichmann (Eawag), Sebastian Böcker and Kai Dührkop (University of Jena), and Oscar Yanes (Center for Omic Sciences, Spain) are thanked for helpful discussions. Funding for JES was provided by the EDA-Emerge project through the EU Seventh Framework Programme (FP7-PEOPLE-2011-ITN) under grant agreement number 290100 and by the Swiss Federal Office for the Environment. ELS was supported by the SOLUTIONS project (EU FP7, grant number 603437). Funding for MAS and RG was provided by the Swiss National Science Foundation.

Supplementary material

ESM 1: 13361_2017_1797_MOESM1_ESM.docx (DOCX, 1501 kB)



Copyright information

© American Society for Mass Spectrometry 2017

Authors and Affiliations

  • Jennifer E. Schollée (1, 2)
  • Emma L. Schymanski (1)
  • Michael A. Stravs (1, 2)
  • Rebekka Gulde (1)
  • Nikolaos S. Thomaidis (3)
  • Juliane Hollender (1, 2)

  1. Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
  2. Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, Zürich, Switzerland
  3. Laboratory of Analytical Chemistry, Department of Chemistry, National and Kapodistrian University of Athens, Athens, Greece
