pp 1–27

Interpretation and automatic integration of geospatial data into the Semantic Web

Towards a process of automatic geospatial data interpretation, classification and integration using semantic technologies
  • Claire Prudhomme
  • Timo Homburg
  • Jean-Jacques Ponciano
  • Frank Boochs
  • Christophe Cruz
  • Ana-Maria Roxin


In the context of disaster management, geospatial information plays a crucial role in the decision-making process to protect and save the population. Gathering as much information as possible from different sources to oversee the current situation is a complex task because of the diversity of data formats and structures. Although several approaches have been designed to integrate data from different sources into an ontology, most of them require background knowledge of the data. However, non-standard data set schemas (NSDS) of relational geospatial data, retrieved for example from web feature services, are not always documented. This lack of background knowledge is a major challenge for automatic semantic data integration. Focusing on this problem, this article presents an automatic approach for integrating geospatial data in NSDS. The approach performs a schema mapping based on the result of an ontology matching that corresponds to a semantic interpretation process. This process is based on geocoding and natural language processing. This article extends a previous publication with an improved unit detection algorithm, data quality and provenance enrichments, and the detection of feature clusters. It also presents an improved evaluation process to better assess the performance of the approach against a manually created ontology. The experiments show that the automatic approach yields a semantic interpretation error of around 10% relative to a manual approach.


Semantic interpretation, Data quality, Natural language processing, Ontologies, Spatial fusion, Semantic Web



We are funded by the German Federal Ministry of Education and Research (Project Reference: 03FH032IX4).



Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2019

Authors and Affiliations

  1. Mainz University of Applied Sciences, Mainz, Germany
  2. Laboratoire d’Informatique de Bourgogne (LIB) - EA 7534, University of Bourgogne Franche-Comté, Dijon, France