Earth Science Informatics, Volume 6, Issue 3, pp 175–185

A Linked Science investigation: enhancing climate change data discovery with semantic technologies

  • Line C. Pouchard
  • Marcia L. Branstetter
  • Robert B. Cook
  • Ranjeet Devarakonda
  • Jim Green
  • Giri Palanisamy
  • Paul Alexander
  • Natalya F. Noy
Software Article

Abstract

Linked Science is the practice of inter-connecting scientific assets by publishing, sharing and linking scientific data and processes in end-to-end loosely coupled workflows that allow the sharing and re-use of scientific data. Much of this data does not live in the cloud or on the Web, but rather in multi-institutional data centers that provide tools and add value through quality assurance, validation, curation, dissemination, and analysis of the data. In this paper, we make the case for the use of scientific scenarios in Linked Science. We propose a scenario in river-channel transport that requires biogeochemical experimental data and global climate-simulation model data from many sources. We focus on the use of ontologies (formal, machine-readable descriptions of the domain) to facilitate search and discovery of this data. Mercury, developed at Oak Ridge National Laboratory, is a tool for distributed metadata harvesting, search and retrieval. Mercury currently provides uniform access to more than 100,000 metadata records; 30,000 scientists use it each month. We augmented search in Mercury with ontologies, such as those in the Semantic Web for Earth and Environmental Terminology (SWEET) collection, by prototyping a component that provides access to the ontology terms from Mercury. We evaluate the coverage of SWEET for the ORNL Distributed Active Archive Center (ORNL DAAC).

Keywords

  • Linked Science
  • Ontologies
  • BioPortal
  • Semantic search
  • Climate change
  • Data discovery

Abbreviations

API

Application programming interface

EC2

Amazon’s elastic compute cloud

EPA

Environmental protection agency

ESGF

Earth system grid federation

ESIP

Earth science information partners

FTP

File transfer protocol

ISO

International organization for standardization

MODIS

Moderate resolution imaging spectroradiometer

NASA

National aeronautics and space administration

NCBO

National center for biomedical ontology

OAI-PMH

Open archives initiative protocol for metadata harvesting

ORNL

Oak ridge national laboratory

ORNL DAAC

ORNL distributed active archive center

OWL

(W3C) web ontology language

RDF

(W3C) resource description framework

SKOS

(W3C) simple knowledge organization system

SPARQL

W3C query language for RDF

SWEET

Semantic web for earth and environmental terminology

SWSE

Semantic web search engine

TB

Terabyte

URL

Uniform resource locator

USGS

US geological survey

VA

Virtual appliance

W3C

World wide web consortium

XML

(W3C) extensible markup language

Copyright information

© Springer-Verlag Berlin Heidelberg (outside the USA) 2013

Authors and Affiliations

  • Line C. Pouchard (1)
  • Marcia L. Branstetter (1)
  • Robert B. Cook (1)
  • Ranjeet Devarakonda (1)
  • Jim Green (1)
  • Giri Palanisamy (1)
  • Paul Alexander (2)
  • Natalya F. Noy (2)

  1. Oak Ridge National Laboratory, Oak Ridge, USA
  2. Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, USA
