Analytical and Bioanalytical Chemistry, Volume 409, Issue 7, pp 1729–1735

Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard

Rapid Communication

DOI: 10.1007/s00216-016-0139-z

Cite this article as:
McEachran, A.D., Sobus, J.R. & Williams, A.J. Anal Bioanal Chem (2017) 409: 1729. doi:10.1007/s00216-016-0139-z


Chemical features observed using high-resolution mass spectrometry can be tentatively identified using online chemical reference databases by searching molecular formulae and monoisotopic masses and then rank-ordering the hits using appropriate relevance criteria. The most likely candidate “known unknowns,” which are chemicals unknown to an investigator but contained within a reference database or literature source, rise to the top of a chemical list when rank-ordered by the number of associated data sources. The U.S. EPA’s CompTox Chemistry Dashboard is a curated and freely available resource for chemistry and computational toxicology research, containing more than 720,000 chemicals of relevance to environmental health science. In this research, the performance of the Dashboard for identifying known unknowns was evaluated against that of the online ChemSpider database, one of the primary resources used by mass spectrometrists, using multiple previously studied datasets reported in the peer-reviewed literature and totaling 162 chemicals. These chemicals were examined in both applications via molecular formula and monoisotopic mass searches, followed by rank-ordering of candidate compounds by associated references or data sources. A greater percentage of chemicals ranked in the top position when using the Dashboard, indicating an advantage of this application over ChemSpider for identifying known unknowns using data source ranking. Additional approaches are being developed for inclusion in a non-targeted analysis workflow as part of the CompTox Chemistry Dashboard. This work shows the potential for use of the Dashboard in exposure assessment and risk decision-making through significant improvements in non-targeted chemical identification.
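The search-and-rank idea described above can be illustrated with a minimal Python sketch. It is not the Dashboard's or ChemSpider's implementation or API. The `Candidate` class, the function names, the 5 ppm mass tolerance, and the toy candidate list (placeholder names, invented masses and data source counts) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical database record; field names are illustrative only."""
    name: str
    monoisotopic_mass: float  # in Da
    data_sources: int         # number of associated data sources


def search_by_mass(candidates: List[Candidate], query_mass: float,
                   ppm: float = 5.0) -> List[Candidate]:
    """Return candidates whose mass lies within +/- ppm of the query mass.

    The 5 ppm default is an assumed tolerance, not a value from the article.
    """
    tolerance = query_mass * ppm / 1e6
    return [c for c in candidates
            if abs(c.monoisotopic_mass - query_mass) <= tolerance]


def rank_by_data_sources(hits: List[Candidate]) -> List[Candidate]:
    """Rank-order hits so the candidate with the most data sources comes first."""
    return sorted(hits, key=lambda c: c.data_sources, reverse=True)


def rank_of(ranked: List[Candidate], name: str) -> Optional[int]:
    """1-based position of the named candidate, or None if it is not a hit."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate.name == name:
            return position
    return None


def percent_in_top_position(ranks: List[Optional[int]]) -> float:
    """Percentage of searched chemicals whose correct structure ranked first."""
    return 100.0 * sum(1 for r in ranks if r == 1) / len(ranks)


# Toy data: placeholder names, invented masses and source counts.
CANDIDATES = [
    Candidate("candidate A", 194.0804, 12),
    Candidate("candidate B", 194.0804, 87),
    Candidate("candidate C", 194.0810, 40),
    Candidate("candidate D", 194.1000, 150),  # outside the mass window
]

ranked = rank_by_data_sources(search_by_mass(CANDIDATES, 194.0804))
```

With this toy list, `candidate D` is excluded by the mass window despite its high source count, and `candidate B` ranks first among the remaining hits. Repeating the lookup for every chemical in a test set and passing the resulting ranks to `percent_in_top_position` gives the kind of top-position percentage used to compare the two applications.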

Graphical abstract

Identifying known unknowns in the US EPA's CompTox Chemistry Dashboard from molecular formula and monoisotopic mass inputs


Keywords

Non-targeted analysis, Suspect screening, DSSTox, High-resolution mass spectrometry

Supplementary material

216_2016_139_MOESM1_ESM.pdf (1.1 mb)
ESM 1 (PDF 694 kb)
216_2016_139_MOESM2_ESM.xlsx (33 kb)
Table S1. Identifiers, masses, and formulae of the 162 chemicals in the study. DTXSID is the DSSTox substance identifier, the unique identifier in the US EPA's DSSTox database. (XLSX 32 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg (outside the USA) 2016

Authors and Affiliations

  1. Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, Durham, USA
  2. National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Durham, USA
  3. National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Durham, USA