
Intelligent polar cyberinfrastructure: enabling semantic search in geospatial metadata catalogue to support polar data discovery

Earth Science Informatics



Polar regions have garnered substantial research attention in recent years because they are key drivers of the Earth’s climate, a source of rich mineral resources, and the home of a variety of marine life. Nevertheless, global warming over the past century is pushing the polar systems towards a tipping point: the systems are at high risk from melting snow and sea ice cover, thawing permafrost, and acidification of the Arctic Ocean. To increase understanding of the polar environment, the National Science Foundation established a Polar Cyberinfrastructure (CI) program, aimed at utilizing advanced software architecture to support polar data analysis and decision-making. At the center of this Polar CI research are data resources and data discovery components that facilitate the search and retrieval of polar data. This paper reports our development of a semantic search tool that supports the intelligent discovery of polar datasets. The tool is built on latent semantic analysis techniques, which improve search performance by identifying hidden semantic associations between the terms used in the metadata of different datasets. The software tool is implemented using an object-oriented design pattern and has been successfully integrated into a popular open source metadata catalogue as a new semantic search capability. A semantic matrix is maintained persistently within the catalogue to store the semantic associations. A dynamic update mechanism was also developed so that the semantics are updated automatically whenever metadata are loaded into or removed from the catalogue. We explored the effect of rank reduction on the effectiveness of this semantic search module and demonstrated that it outperforms traditional search techniques.
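The latent semantic analysis step summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the four toy metadata strings, the function names (`build_term_document_matrix`, `truncated_svd`, `semantic_scores`), the raw term-count weighting, and the choice of rank 2 are all assumptions made for illustration. The sketch builds a term-document matrix, reduces its rank with a truncated singular value decomposition, folds a keyword query into the reduced space, and ranks records by cosine similarity.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i, j] counts term i in document j."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    row = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc.lower().split():
            matrix[row[tok], j] += 1.0
    return vocab, matrix


def truncated_svd(matrix, rank):
    """Rank reduction: keep only the `rank` largest singular triplets."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


def semantic_scores(query, vocab, u_k, s_k, vt_k):
    """Cosine similarity between a folded-in query and every document."""
    row = {term: i for i, term in enumerate(vocab)}
    q = np.zeros(len(vocab))
    for tok in query.lower().split():
        if tok in row:  # terms outside the vocabulary are ignored
            q[row[tok]] += 1.0
    # Fold the query into the latent space: q_hat = S_k^{-1} U_k^T q.
    # Assumes the retained singular values are non-zero.
    q_hat = (u_k.T @ q) / s_k
    doc_vecs = vt_k.T  # one row per document in the latent space
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    denom[denom == 0.0] = 1.0  # avoid division by zero for empty vectors
    return (doc_vecs @ q_hat) / denom


# Hypothetical metadata snippets, invented for this example only.
docs = [
    "sea ice extent",
    "sea ice thickness",
    "permafrost soil carbon flux",
    "permafrost soil thaw",
]
vocab, a = build_term_document_matrix(docs)
u_k, s_k, vt_k = truncated_svd(a, rank=2)

# Plain keyword matching finds only the record that literally contains the term.
keyword_hits = ["extent" in doc.split() for doc in docs]
# The latent-space ranking also scores records that share co-occurring terms.
scores = semantic_scores("extent", vocab, u_k, s_k, vt_k)
print(keyword_hits)
print(np.round(scores, 3))
```

In this toy corpus the query "extent" literally matches only the first record, yet the reduced-rank space also scores the second record highly, because both share "sea" and "ice"; the two permafrost records score near zero. The retained rank is the tuning parameter whose effect the paper studies, and a persistent catalogue would need to refresh the decomposition as records are added or removed; simply refitting is the most basic way to do that and is not necessarily the mechanism the paper uses.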






This paper is supported by National Science Foundation Award #1349259 and Open Geospatial Consortium Award #027216.

Author information



Corresponding author

Correspondence to Wenwen Li.

Additional information

Communicated by: H. A. Babaie

Published in the Special Issue “Semantic e-Sciences” with Guest Editors Dr. Xiaogang Ma, Dr. Peter Fox, Dr. Thomas Narock and Dr. Brian Wilson


About this article


Cite this article

Li, W., Bhatia, V. & Cao, K. Intelligent polar cyberinfrastructure: enabling semantic search in geospatial metadata catalogue to support polar data discovery. Earth Sci Inform 8, 111–123 (2015).


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: