Analytical and Bioanalytical Chemistry

Volume 409, Issue 7, pp 1729–1735

Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard

Rapid Communication

Abstract

Chemical features observed using high-resolution mass spectrometry can be tentatively identified using online chemical reference databases by searching molecular formulae and monoisotopic masses and then rank-ordering the hits using appropriate relevance criteria. The most likely candidate “known unknowns,” which are those chemicals unknown to an investigator but contained within a reference database or literature source, rise to the top of a chemical list when rank-ordered by the number of associated data sources. The U.S. EPA’s CompTox Chemistry Dashboard is a curated and freely available resource for chemistry and computational toxicology research, containing more than 720,000 chemicals of relevance to environmental health science. In this research, the performance of the Dashboard for identifying known unknowns was evaluated against that of the online ChemSpider database, one of the primary resources used by mass spectrometrists, using multiple previously studied datasets reported in the peer-reviewed literature totaling 162 chemicals. These chemicals were examined using both applications via molecular formula and monoisotopic mass searches, followed by rank-ordering of candidate compounds by associated references or data sources. A greater percentage of chemicals ranked in the top position when using the Dashboard, indicating an advantage of this application over ChemSpider for identifying known unknowns using data source ranking. Additional approaches are being developed for inclusion in a non-targeted analysis workflow as part of the CompTox Chemistry Dashboard. This work shows the potential for use of the Dashboard in exposure assessment and risk decision-making through significant improvements in non-targeted chemical identification.
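
The search-and-rank procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the Dashboard's or ChemSpider's implementation. The `Candidate` record, the function names, the 5 ppm default tolerance, and all example values in the usage note are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical database record; field names are illustrative."""
    name: str
    formula: str
    monoisotopic_mass: float  # in Da
    data_sources: int         # number of associated data sources


def within_ppm(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= theoretical * ppm * 1e-6


def rank_by_mass(candidates, observed_mass, ppm=5.0):
    """Keep candidates matching the observed monoisotopic mass, then
    order them by data source count, highest first."""
    hits = [c for c in candidates
            if within_ppm(observed_mass, c.monoisotopic_mass, ppm)]
    return sorted(hits, key=lambda c: c.data_sources, reverse=True)


def rank_by_formula(candidates, formula):
    """Keep candidates with an identical molecular formula string, then
    order them by data source count, highest first."""
    hits = [c for c in candidates if c.formula == formula]
    return sorted(hits, key=lambda c: c.data_sources, reverse=True)


def top_position_rate(ranked_lists, expected_names):
    """Fraction of queries whose expected chemical is ranked first."""
    if not expected_names:
        return 0.0
    top = sum(1 for ranked, name in zip(ranked_lists, expected_names)
              if ranked and ranked[0].name == name)
    return top / len(expected_names)
```

For example, given three invented records, `Candidate("A", "C15H12N2O", 236.09496, 12)`, `Candidate("B", "C15H12N2O", 236.09496, 87)`, and `Candidate("C", "C16H16N2", 236.13135, 40)`, calling `rank_by_mass(records, 236.0951)` keeps A and B and places B first because it has more associated data sources. `top_position_rate` mirrors the comparison metric described above, the share of chemicals ranked in the top position.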

Graphical abstract

Identifying known unknowns in the US EPA's CompTox Chemistry Dashboard from molecular formula and monoisotopic mass inputs

Keywords

Non-targeted analysis; Suspect screening; DSSTox; High-resolution mass spectrometry

Supplementary material

216_2016_139_MOESM1_ESM.pdf (1.1 mb)
ESM 1 (PDF 694 kb)
216_2016_139_MOESM2_ESM.xlsx (33 kb)
Table S1: Identifiers, masses, and formulae of the 162 chemicals in the study. DTXSID is the DSSTox substance identifier, the unique identifier in the US EPA's DSSTox Database. (XLSX 32 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg (outside the USA) 2016

Authors and Affiliations

  1. Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, Durham, USA
  2. National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Durham, USA
  3. National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Durham, USA