Abstract
Linked Science is the practice of interconnecting scientific assets by publishing, sharing, and linking scientific data and processes in end-to-end, loosely coupled workflows that enable the reuse of scientific data. Much of this data does not live in the cloud or on the Web, but rather in multi-institutional data centers that provide tools and add value through quality assurance, validation, curation, dissemination, and analysis of the data. In this paper, we make the case for the use of scientific scenarios in Linked Science. We propose a scenario in river-channel transport that requires biogeochemical experimental data and global climate-simulation model data from many sources. We focus on the use of ontologies (formal, machine-readable descriptions of the domain) to facilitate search and discovery of this data. Mercury, developed at Oak Ridge National Laboratory, is a tool for distributed metadata harvesting, search, and retrieval. Mercury currently provides uniform access to more than 100,000 metadata records; 30,000 scientists use it each month. We augmented search in Mercury with ontologies, such as those in the Semantic Web for Earth and Environmental Terminology (SWEET) collection, by prototyping a component that provides access to ontology terms from Mercury. We evaluate the coverage of SWEET for the ORNL Distributed Active Archive Center (ORNL DAAC).
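As a rough illustration of how ontology terms can widen a metadata search, the following minimal sketch expands a query term with related labels before matching it against records. It is not Mercury's implementation or API: the toy ontology, the record fields, and the functions `expand_query` and `search` are hypothetical names introduced only for this example, and the ontology entries are not taken from SWEET.

```python
from typing import Dict, List

# Hypothetical toy ontology: each term maps to related labels.
# These entries are illustrative only and are not taken from SWEET.
TOY_ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "precipitation": {"synonyms": ["rainfall"], "narrower": ["snow", "hail"]},
    "river": {"synonyms": ["stream"], "narrower": ["tributary"]},
}

# Hypothetical metadata records; the field names are assumptions.
RECORDS: List[Dict[str, str]] = [
    {"id": "rec-1", "title": "Daily rainfall totals", "abstract": "Gauge data."},
    {"id": "rec-2", "title": "Soil carbon survey", "abstract": "Field cores."},
    {"id": "rec-3", "title": "Snow depth grids", "abstract": "Modeled snow cover."},
]


def expand_query(term: str, ontology: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return the normalized term followed by its synonyms and narrower terms."""
    key = term.strip().lower()
    expanded = [key]
    entry = ontology.get(key, {})
    for relation in ("synonyms", "narrower"):
        for label in entry.get(relation, []):
            if label not in expanded:
                expanded.append(label)
    return expanded


def search(records: List[Dict[str, str]], terms: List[str]) -> List[str]:
    """Return ids of records whose title or abstract contains any term.

    Uses naive substring matching, which is enough for a sketch.
    """
    hits = []
    for record in records:
        text = (record["title"] + " " + record["abstract"]).lower()
        if any(term in text for term in terms):
            hits.append(record["id"])
    return hits


if __name__ == "__main__":
    plain = search(RECORDS, ["precipitation"])
    expanded = search(RECORDS, expand_query("Precipitation", TOY_ONTOLOGY))
    print("plain search:", plain)
    print("expanded search:", expanded)
```

In this toy data, no record mentions "precipitation" literally, so the plain search finds nothing, while the expanded search also matches records described with related terms. A production system would more likely match on indexed metadata fields than on raw substrings.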
Abbreviations
- API: Application programming interface
- EC2: Amazon’s elastic compute cloud
- EPA: Environmental protection agency
- ESGF: Earth system grid federation
- ESIP: Earth science information partners
- FTP: File transfer protocol
- ISO: International organization for standardization
- MODIS: Moderate resolution imaging spectroradiometer
- NASA: National aeronautics and space administration
- NCBO: National center for biomedical ontology
- OAI-PMH: Open archives initiative protocol for metadata harvesting
- ORNL: Oak ridge national laboratory
- ORNL DAAC: ORNL distributed active archive center
- OWL: (W3C) web ontology language
- RDF: (W3C) resource description framework
- SKOS: (W3C) simple knowledge organization system
- SPARQL: W3C query language for RDF
- SWEET: Semantic web for earth and environmental terminology
- SWSE: Semantic web search engine
- TB: Terabyte
- URL: Uniform resource locator
- USGS: US geological survey
- VA: Virtual appliance
- W3C: World wide web consortium
- XML: (W3C) extensible markup language
Acknowledgments
This work was performed in part at Oak Ridge National Laboratory, managed by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 for the U.S. Department of Energy.
Additional information
Communicated by: H. A. Babaie
About this article
Cite this article
Pouchard, L.C., Branstetter, M.L., Cook, R.B. et al. A Linked Science investigation: enhancing climate change data discovery with semantic technologies. Earth Sci Inform 6, 175–185 (2013). https://doi.org/10.1007/s12145-013-0118-2
DOI: https://doi.org/10.1007/s12145-013-0118-2