Interpretation and automatic integration of geospatial data into the Semantic Web

Towards a process of automatic geospatial data interpretation, classification and integration using semantic technologies


In the context of disaster management, geospatial information plays a crucial role in the decision-making process to protect and save the population. Gathering as much information as possible from different sources to oversee the current situation is a complex task because of the diversity of data formats and structures. Although several approaches have been designed to integrate data from different sources into an ontology, most of them require background knowledge of the data. However, non-standard data set schemas (NSDS) of relational geospatial data, retrieved for example from web feature services, are not always documented. This lack of background knowledge is a major challenge for automatic semantic data integration. Focusing on this problem, this article presents an automatic approach for integrating geospatial data in NSDS. The approach performs a schema mapping according to the result of an ontology matching that corresponds to a semantic interpretation process. This process is based on geocoding and natural language processing. The article extends work from a previous publication with an improved unit detection algorithm, data quality and provenance enrichments, and the detection of feature clusters. It also presents an improved evaluation process to better assess the performance of the approach against a manually created ontology. The experiments show that the automatic approach yields a semantic interpretation error of around 10% relative to a manual approach.
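As a rough illustration of the schema mapping idea described above, the following minimal sketch maps undocumented column names to ontology class labels by string similarity. It is not the method of the article, which relies on geocoding and natural language processing; the label list, the function names (normalize, best_match, map_schema) and the 0.6 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical ontology class labels, invented for this example.
ONTOLOGY_LABELS = ["hospital", "school", "fire station", "street name", "population"]


def normalize(name: str) -> str:
    """Lower-case a column name and turn common separators into spaces."""
    return name.replace("_", " ").replace("-", " ").strip().lower()


def best_match(column: str, labels=ONTOLOGY_LABELS, threshold: float = 0.6):
    """Return (label, score) for the most similar label.

    The label is None when the best score stays below the threshold,
    which is an assumed cut-off and not a value from the article.
    """
    col = normalize(column)
    scored = [(SequenceMatcher(None, col, label).ratio(), label) for label in labels]
    score, label = max(scored)
    return (label if score >= threshold else None, score)


def map_schema(columns):
    """Map every column name to its best ontology label, or None if unmatched."""
    return {column: best_match(column)[0] for column in columns}


if __name__ == "__main__":
    print(map_schema(["HOSPITAL", "street_name", "xyz"]))
```

Columns that find no sufficiently similar label are mapped to None, so they can be passed on to a richer interpretation step rather than being forced onto a wrong concept.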



  3. Open data portal of Cologne, used to retrieve data that we converted into shapefiles.

  4. Web service for retrieving data from Saarland, which we converted into shapefiles.

  5. A non-expert is someone who knows about Semantic Web technologies but does not know the context and the goal of the data set.




We are funded by the German Federal Ministry of Education and Research (Project Reference: 03FH032IX4).

Author information



Corresponding author

Correspondence to Claire Prudhomme.


Cite this article

Prudhomme, C., Homburg, T., Ponciano, JJ. et al. Interpretation and automatic integration of geospatial data into the Semantic Web. Computing 102, 365–391 (2020).



  • Semantic interpretation
  • Data quality
  • Natural language processing
  • Ontologies
  • Spatial fusion
  • Semantic Web