Bottom-Up Gazetteers: Learning from the Implicit Semantics of Geotags

  • Carsten Keßler
  • Patrick Maué
  • Jan Torben Heuer
  • Thomas Bartoschek
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5892)


As directories of named places, gazetteers link the names to geographic footprints and place types. Most existing gazetteers are managed strictly top-down: entries can only be added or changed by the responsible toponymic authority. The covered vocabulary is therefore often limited to an administrative view on places, using only official place names. In this paper, we propose a bottom-up approach for gazetteer building based on geotagged photos harvested from the web. We discuss the building blocks of a geotag and how they relate to each other to formally define the notion of a geotag. Based on this formalization, we introduce an extraction process for gazetteer entries that captures the emergent semantics of collections of geotagged photos and provides a group-cognitive perspective on named places. Using an experimental setup based on clustering and filtering algorithms, we demonstrate how to identify place names and assign adequate geographic footprints. The results for three different place names (Soho, Camino de Santiago and Kilimanjaro), representing different geographic feature types, are evaluated and compared to the results obtained from traditional gazetteers. Finally, we sketch how our approach can be combined with other (for example, linguistic) approaches and discuss how such a bottom-up gazetteer can complement existing gazetteers.
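
The abstract does not say which clustering and filtering algorithms are used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes photos are given as dictionaries with hypothetical `lon`, `lat` and `tags` keys. It swaps in a simple median-distance outlier filter for the filtering step and a convex hull for the footprint, and it treats coordinates as planar. The function names, the threshold `k` and the sample coordinates are all made up for illustration.

```python
from statistics import median


def select_points(photos, name):
    """Return (lon, lat) pairs of photos whose tags include `name` (case-insensitive)."""
    wanted = name.lower()
    return [(p["lon"], p["lat"]) for p in photos
            if wanted in (t.lower() for t in p["tags"])]


def filter_outliers(points, k=3.0):
    """Drop points whose distance to the coordinate-wise median exceeds
    k times the median of all such distances (an assumed, simple filter)."""
    if len(points) < 3:
        return list(points)
    cx = median(x for x, _ in points)
    cy = median(y for _, y in points)
    dists = [((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points]
    cutoff = k * median(dists)
    return [p for p, d in zip(points, dists) if d <= cutoff]


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def footprint(photos, name, k=3.0):
    """Candidate footprint polygon for a place name: select, filter, then hull."""
    return convex_hull(filter_outliers(select_points(photos, name), k))


# Synthetic sample data (not real coordinates): five consistent points,
# one far-away mis-tagged photo, and one photo without the tag.
sample = [
    {"lon": 0.0, "lat": 0.0, "tags": ["Soho", "london"]},
    {"lon": 1.0, "lat": 0.0, "tags": ["soho"]},
    {"lon": 1.0, "lat": 1.0, "tags": ["SOHO", "street"]},
    {"lon": 0.0, "lat": 1.0, "tags": ["soho"]},
    {"lon": 0.5, "lat": 0.5, "tags": ["soho", "cafe"]},
    {"lon": 50.0, "lat": 50.0, "tags": ["soho"]},
    {"lon": 0.2, "lat": 0.2, "tags": ["london"]},
]
print(footprint(sample, "Soho"))
```

A convex hull overestimates the footprint of concave or linear features such as a pilgrimage route. A tighter boundary method would be needed for those, which this sketch does not attempt.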


Point Cloud, Delaunay Triangulation, Volunteer Geographic Information, Place Type, Spatial Data Infrastructure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Carsten Keßler (1)
  • Patrick Maué (1)
  • Jan Torben Heuer (1)
  • Thomas Bartoschek (1)
  1. Institute for Geoinformatics, University of Münster, Germany
