Toponym Disambiguation Using Ontology-Based Semantic Similarity

  • David S. Batista
  • João D. Ferreira
  • Francisco M. Couto
  • Mário J. Silva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7243)


We propose a new heuristic for toponym sense disambiguation, to be used when mapping toponyms in text to ontology concepts, using techniques based on semantic similarity measures. We evaluated the proposed approach using a collection of Portuguese news articles from which the geographic entity names were extracted and then manually mapped to concepts in a geospatial ontology covering the territory of Portugal. The results suggest that using semantic similarity to disambiguate toponyms in text compares favorably with a baseline method.
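
The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of the general idea it describes: when a toponym maps to several ontology concepts, prefer the concept that is semantically closest to the concepts of the other toponyms in the same text. Everything here is an assumption made for illustration: the toy part-of hierarchy, the candidate lists, the function names, and the simple path-based similarity, which stands in for whatever similarity measures the authors actually used.

```python
# Toy part-of hierarchy (child -> parent). Illustrative assumption only,
# not the geospatial ontology used in the paper.
PARENT = {
    "Portugal": None,
    "Faro (district)": "Portugal",
    "Açores (region)": "Portugal",
    "Portimão": "Faro (district)",
    "Lagoa (Faro)": "Faro (district)",
    "Lagoa (Açores)": "Açores (region)",
}

# Hypothetical candidate concepts for each toponym found in a text.
CANDIDATES = {
    "Lagoa": ["Lagoa (Faro)", "Lagoa (Açores)"],
    "Portimão": ["Portimão"],
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def similarity(a, b):
    """Path-based similarity: 1 / (1 + number of edges between a and b
    through their lowest common ancestor). A stand-in measure."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {c: i for i, c in enumerate(path_b)}
    for i, c in enumerate(path_a):
        if c in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[c])
    return 0.0  # no common ancestor


def disambiguate(toponyms):
    """For each toponym, pick the candidate concept with the highest total
    similarity to the best-matching candidates of the other toponyms."""
    chosen = {}
    for t in toponyms:
        others = [o for o in toponyms if o != t]

        def score(cand):
            return sum(
                max(similarity(cand, oc) for oc in CANDIDATES[o])
                for o in others
            )

        chosen[t] = max(CANDIDATES[t], key=score)
    return chosen


print(disambiguate(["Lagoa", "Portimão"]))
# {'Lagoa': 'Lagoa (Faro)', 'Portimão': 'Portimão'}
```

In this toy case "Lagoa" resolves to the concept in the Faro district because it shares a closer ancestor with "Portimão" than the Azorean concept does. An information-content measure could be substituted for `similarity` without changing the selection logic.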


Keywords: semantic similarity, ontologies, toponym sense disambiguation, geographic information retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • David S. Batista (1)
  • João D. Ferreira (2)
  • Francisco M. Couto (2)
  • Mário J. Silva (1)

  1. IST/INESC-ID Lisbon, Lisbon, Portugal
  2. LaSIGE, University of Lisbon, Lisbon, Portugal
