Exploring Wikipedia and Text Features for Named Entity Disambiguation

  • Hien T. Nguyen
  • Tru H. Cao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5991)


Precisely identifying entities is essential for semantic annotation. This paper addresses the problem of named entity disambiguation, which aims to map entity mentions in a text to the correct entities in Wikipedia. We explore and evaluate various combinations of features extracted from Wikipedia and from the text for the disambiguation task, based on a statistical ranking model of candidate entities. Through experiments, we show which feature combinations perform best for disambiguation.
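
The abstract does not specify the ranking model or the features, so the following is only a minimal sketch of the general idea of ranking candidate entities against a mention's context. It assumes bag-of-words features and cosine similarity as the score, which may differ from the paper's actual model. The function names, candidate titles, and description strings are hypothetical and are not real Wikipedia extracts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(context, candidates):
    """Return (title, score) pairs sorted by descending similarity to the context."""
    context_vec = bag_of_words(context)
    scored = [
        (title, cosine(context_vec, bag_of_words(description)))
        for title, description in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate entities for the mention "jaguar" (illustrative text only).
candidates = {
    "Jaguar (animal)": "large cat species native to the Americas rainforest predator",
    "Jaguar Cars": "British luxury car manufacturer vehicles engine",
}
context = "The jaguar is a predator that hunts in the rainforest"

ranking = rank_candidates(context, candidates)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

In this toy setup the candidate whose description shares the most vocabulary with the mention's context is ranked first. Richer features drawn from Wikipedia or the text would replace the plain token counts.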





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Hien T. Nguyen (1)
  • Tru H. Cao (2)

  1. Ton Duc Thang University, Vietnam
  2. Ho Chi Minh City University of Technology, Vietnam
