Earth Science Informatics, Volume 7, Issue 4, pp 249–264

SWEET ontology coverage for earth system sciences

  • Nicholas DiGiuseppe
  • Line C. Pouchard
  • Natalya F. Noy
Research Article


Scientists in the Earth and Environmental Sciences (EES) domain increasingly use ontologies to analyze and integrate their data. For example, NASA’s SWEET (Semantic Web for Earth and Environmental Terminology) ontologies have become the de facto standard for representing the EES domain formally (Raskin 2010). We therefore need principled ways both to evaluate existing ontologies and to assess their quality quantitatively. The existing literature describes many potential quality metrics for ontologies. Among these is the coverage metric, which approximates the relevance of an ontology to a corpus (Yao et al. 2011). This paper makes three primary contributions to the EES domain: (1) we investigate the applicability of existing coverage techniques to the EES domain; (2) we present a novel expansion of existing techniques that uses thesauri to generate equivalence and subclass axioms automatically; and (3) we present an experiment to establish an upper-bound coverage expectation for the SWEET ontologies against real-world EES corpora from DataONE (Michener et al. 2012) and against a corpus built from research articles specifically to match the topics covered by the SWEET ontologies. This initial evaluation suggests that the SWEET ontologies can accurately represent real corpora within the EES domain.


Ontology, Ontology coverage, Semantic web, Empirical



This material is based upon work supported by the National Science Foundation, through Award CCF-1116943 and through a Graduate Research Fellowship under Grant No. DGE-0808392. Michael Huhns was extremely helpful in directing and crystallizing this research. We would also like to thank Andrey Rzhetsky for providing the seven thesauri used in our experiment.


  1. Bird S, Loper E, Klein E (2009) Natural language processing with Python
  2. Brank J, Grobelnik M, Mladenić D (2005) A survey of ontology evaluation techniques. In: Conference on data mining and data warehouses (SiKDD 2005)
  3. Cimiano P, Hotho A, Staab S (2005) Learning concept hierarchies from text corpora using formal concept analysis. J Artif Intell Res (JAIR) 24:305–339
  4. Dellschaft K, Staab S (2006) On how to perform a gold standard based evaluation of ontology learning. In: The Semantic Web-ISWC 2006. Springer, pp 228–241
  5. Devlin J (1961) A dictionary of synonyms and antonyms
  6. Doan A, Madhavan J, Dhamankar R, Domingos P, Halevy A (2003) Learning to match ontologies on the semantic web. VLDB J - Int J Very Large Data Bases 12(4):303–319
  7. Gibson A, Wolstencroft K, Stevens R (2007) Promotion of ontological comprehension: Exposing terms and metadata with web 2.0. In: Workshop on social and collaborative construction of structured knowledge at WWW 2007
  8. Hahn U, Schnattinger K (1998) Towards text knowledge engineering. Hypothesis 1:2
  9. Hoehndorf R, Dumontier M, Gkoutos GV (2012) Evaluation of research in biomedical ontologies. Briefings in Bioinformatics
  10. S I (2001) Scholastic dictionary of synonyms, antonyms, and homonyms
  11. Kauppinen T, Pouchard L, Kessler C (2011) Proceedings of the First International Workshop on Linked Science (LISC 2011), volume CEUR Workshop Proceedings, p 783
  12. Kipfer BA (1993) 21st century synonym and antonym finder. Dell
  13. Laird CG (2003) Webster’s New World Roget’s A-Z Thesaurus. SimonandSchuster.com
  14. LaRoche N, Rodale JJI, Urdang L (1978) The Synonym Finder. Rodale
  15. Lawrie D, Binkley D, Morrell C (2010) Normalizing source code vocabulary. In: 17th Working Conference on Reverse Engineering (WCRE) 2010. IEEE, pp 3–12
  16. Lynnes C (2012) Toolmatch. Proceedings of the ESIP Summer Meeting
  17. Maedche A, Staab S (2002) Measuring similarity between ontologies. In: Knowledge engineering and knowledge management: Ontologies and the semantic web. Springer, pp 251–263
  18. Maynard D, Peters W, Li Y (2006) Metrics for evaluation of ontology-based information extraction. In: International world wide web conference
  19. Michener WK, Allard S, Budden A, Cook RB, Douglass K, Frame M, Kelling S, Koskela R, Tenopir C, Vieglais DA (2012) Participatory design of DataONE: enabling cyberinfrastructure for the biological and environmental sciences. Ecol Inform 11:5–15
  20. Miller GA (1995) WordNet: a lexical database for English. Commun ACM 38(11):2
  21. Pouchard L, Cook R, Green J, Palanisamy G, Noy N (2011) Semantic technologies improving the recall and precision of the Mercury metadata search engine. AGU Fall Meet Abstr 1:1437
  22. Raskin RG (2010) SWEET 2.1 Ontologies. AGU Fall Meeting Abstracts, p B6+
  23. Rozell E, Fox P, Zheng J, Hendler J (2012) S2S architecture and faceted browsing applications. In: Proceedings of the 21st international conference companion on World Wide Web, WWW ’12 Companion. ACM, New York, pp 413–416
  24. Spooner A (2007) The Oxford dictionary of synonyms and antonyms. Oxford University Press
  25. Tripathi A, Babaie HA (2008) Developing a modular hydrogeology ontology by extending the SWEET upper-level ontologies. Comput Geosci 34(9):1022–1033
  26. Verspoor K, Cohn J, Mniszewski S, Joslyn C (2006) A categorization approach to automated ontological function annotation. Protein Sci 15(6):1544–1549
  27. Wiegand N, Garcia C (2007) A task-based ontology approach to automate geospatial data retrieval. Trans GIS 11(3):355–376
  28. Yao L, Divoli A, Mayzus I, Evans JA, Rzhetsky A (2011) Benchmarking Ontologies: Bigger or Better? PLoS Comput Biol 7(1):e1001055+

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Nicholas DiGiuseppe (1, corresponding author)
  • Line C. Pouchard (2)
  • Natalya F. Noy (3)

  1. University of California, Irvine, Irvine, USA
  2. Oak Ridge National Laboratory, Oak Ridge, USA
  3. Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, USA
