Abstract
Geospatial datasets must be joined to exploit the complete set of information available in each of them. Many open-source geospatial datasets are available, such as GeoNames, OpenStreetMap (OSM) and Natural Earth, and to obtain a comprehensive dataset containing the union of all available information, such datasets must be linked optimally, without redundancy or loss of information. Many geolocations on digital maps are not classified by importance because additional information such as population or administrative level is lacking. One way to assign an importance scale to place names is to link GeoNames to other datasets (OSM, Natural Earth). OpenStreetMap data provides only a limited number of place classifications (such as city, town, village). For the best cartographic results, more comprehensive classes are needed for ranking cities. The challenges include geometry searching, matching, buffer determination, inclusion of local and regional name text, and accuracy. The current research work addresses these challenges: GeoNames, Natural Earth and OpenStreetMap data tables have been merged with the union of all their attribute columns, resulting in a complete geospatial dataset with a place accuracy of at least 95 % for any given country dataset. The data tables at global level consist of hundreds of thousands of rows, each row representing a geolocation. Complete and fuzzy searching and matching on geometry, name and geo-id within a buffer of 50 km took between 30 s and 1 min on a commodity computer with a 2 GHz processor and 2 GB of memory, depending on the size and complexity of the query run for a country, whose list of points could range from a dozen to several hundred. The future aim is to do this for global datasets, creating an all-encompassing geodata bank that includes administrative, political and ecological details from important databases such as GAUL, SALB and GADM.
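The linking step described in the abstract (a spatial search within a 50 km buffer combined with fuzzy name matching, followed by a union of attribute columns) can be illustrated with a minimal Python sketch. This is not the chapter's implementation. The record layout, the function names (`haversine_km`, `name_similarity`, `link_places`), the similarity threshold of 0.8 and the sample rows are all assumptions chosen for illustration; only the 50 km buffer comes from the text.

```python
import math
from difflib import SequenceMatcher

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (hypothetical fuzzy measure)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def link_places(left, right, buffer_km=50.0, min_similarity=0.8):
    """Link each left record to its best-named right record inside the buffer.

    Returns one merged row per left record. A matched row carries the union
    of both attribute sets; on a key clash the left value is kept.
    The 0.8 threshold is an assumption, not a value from the chapter.
    """
    merged = []
    for l in left:
        best, best_score = None, min_similarity
        for r in right:
            # Spatial filter first: skip candidates outside the buffer.
            if haversine_km(l["lat"], l["lon"], r["lat"], r["lon"]) > buffer_km:
                continue
            score = name_similarity(l["name"], r["name"])
            if score >= best_score:
                best, best_score = r, score
        row = dict(l)
        if best is not None:
            for key, value in best.items():
                row.setdefault(key, value)  # union of attribute columns
        merged.append(row)
    return merged


# Illustrative rows only; coordinates are approximate.
geonames_rows = [
    {"name": "Pardubice", "lat": 50.04, "lon": 15.78, "feature_code": "PPLA"},
]
osm_rows = [
    {"name": "Pardubice", "lat": 50.0386, "lon": 15.7791, "place": "city"},
    {"name": "Praha", "lat": 50.0755, "lon": 14.4378, "place": "city"},
]

merged = link_places(geonames_rows, osm_rows)
print(merged)
```

The nested loop compares every pair of records, which is acceptable for country lists of a dozen to a few hundred points but not for global tables; at that scale a spatial index in the database would be the natural replacement for the distance filter.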
Acknowledgments
The Ministry of Education, Youth and Sports of the Czech Republic, Project CZ.1.07/2.3.00/30.0021 “Strengthening of Research and Development Teams at the University of Pardubice”, financially supported this work.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Bhattacharya, D., Pasquali, P., Komarkova, J., Sedlak, P., Saha, A., Boccardo, P. (2015). Interlinking Opensource Geo-Spatial Datasets for Optimal Utility in Ranking. In: Brus, J., Vondrakova, A., Vozenilek, V. (eds) Modern Trends in Cartography. Lecture Notes in Geoinformation and Cartography. Springer, Cham. https://doi.org/10.1007/978-3-319-07926-4_13
DOI: https://doi.org/10.1007/978-3-319-07926-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07925-7
Online ISBN: 978-3-319-07926-4
eBook Packages: Earth and Environmental Science (R0)