Abstract
In disaster management, geospatial information plays a crucial role in the decisions made to protect and save the population. Gathering as much information as possible from different sources to oversee the current situation is a complex task because data formats and structures are so diverse. Several approaches have been designed to integrate data from different sources into an ontology, but most require background knowledge of the data. However, the non-standard data set schemas (NSDS) of relational geospatial data retrieved from, for example, web feature services are not always documented. This lack of background knowledge is a major challenge for automatic semantic data integration. To address this problem, this article presents an automatic approach for integrating geospatial data in NSDS. The approach performs a schema mapping based on the result of an ontology matching that corresponds to a semantic interpretation process. This process is based on geocoding and natural language processing. This article extends a previous publication with an improved unit detection algorithm, data quality and provenance enrichments, and the detection of feature clusters. It also presents an improved evaluation process to better assess the performance of the approach against a manually created ontology. The experiments show that the automatic approach has a semantic interpretation error of around 10% relative to a manual approach.
Notes
- 3.
https://offenedaten-koeln.de Open data portal of Cologne, from which we retrieved data and converted it to shapefiles.
- 4.
http://geoportal.saarland.de/arcgis/services/Internet/Gesundheit/MapServer/WFSServer Web service for retrieving data from Saarland, which we converted to shapefiles.
- 5.
A non-expert is someone who knows about Semantic Web technologies but does not know the context and the goal of the data set.
Acknowledgements
This work is funded by the German Federal Ministry of Education and Research (https://www.bmbf.de/en/index.html, project reference 03FH032IX4).
Cite this article
Prudhomme, C., Homburg, T., Ponciano, JJ. et al. Interpretation and automatic integration of geospatial data into the Semantic Web. Computing 102, 365–391 (2020). https://doi.org/10.1007/s00607-019-00701-y
Keywords
- Semantic interpretation
- Data quality
- Natural language processing
- Ontologies
- Spatial fusion
- Semantic Web