Using Lexical and Thematic Knowledge for Name Disambiguation

  • Jinpeng Wang
  • Wayne Xin Zhao
  • Rui Yan
  • Haitian Wei
  • Jian-Yun Nie
  • Xiaoming Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7675)


In this paper we present a novel approach to name disambiguation based on two types of semantic information: lexical and thematic. We propose to use translation-based language models to resolve the synonymy problem in word matching, and a topic-based ranking function to capture the rich thematic context of names. We test three ranking functions that combine lexical relatedness and thematic relatedness. Experiments on a Wikipedia data set and the TAC-KBP 2010 data set show that the proposed method is effective for name disambiguation.
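The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the three ranking functions, so the linear interpolation below, the Jelinek-Mercer smoothing, the use of KL divergence as the thematic distance, and all names (`translation_lm_score`, `kl_divergence`, `rank_candidates`, `lam`, `alpha`) are assumptions chosen for illustration. The translation table and topic distributions are toy values; in practice they would be learned from data.

```python
import math

def translation_lm_score(query_words, doc_words, trans_prob, background, lam=0.2):
    """Log-probability of the mention context under a translation-based
    language model of a candidate's description (assumed formulation)."""
    counts = {}
    for w in doc_words:
        counts[w] = counts.get(w, 0) + 1
    doc_len = len(doc_words)
    score = 0.0
    for q in query_words:
        # P(q | d) = sum over document words w of T(q | w) * P_ml(w | d).
        # A word translates to itself with probability 1 unless the table says otherwise.
        p_trans = sum(
            trans_prob.get((q, w), 1.0 if q == w else 0.0) * c / doc_len
            for w, c in counts.items()
        )
        # Jelinek-Mercer smoothing with a background model (assumed choice).
        p = (1 - lam) * p_trans + lam * background.get(q, 1e-6)
        score += math.log(p)
    return score

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two topic distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def rank_candidates(context, context_topics, candidates, trans_prob, background, alpha=0.5):
    """Rank candidates by an assumed linear mix of lexical and thematic relatedness."""
    scored = []
    for name, (doc_words, doc_topics) in candidates.items():
        lexical = translation_lm_score(context, doc_words, trans_prob, background)
        thematic = -kl_divergence(context_topics, doc_topics)  # smaller distance is better
        scored.append((alpha * lexical + (1 - alpha) * thematic, name))
    return [name for _, name in sorted(scored, reverse=True)]

# Toy data (hypothetical): two candidates for one ambiguous name.
candidates = {
    "cand_sports": (["basketball", "player", "chicago", "bulls"], [0.9, 0.1]),
    "cand_science": (["machine", "learning", "professor", "berkeley"], [0.1, 0.9]),
}
trans_prob = {("hoops", "basketball"): 0.5, ("nba", "bulls"): 0.3}
ranking = rank_candidates(["hoops", "nba"], [0.8, 0.2], candidates, trans_prob, {})
print(ranking)
```

In this toy case the mention context shares no word with either candidate description, so exact matching alone could not separate them; the translation table lets "hoops" match "basketball", which is the synonymy effect the abstract refers to.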


Keywords: Name Disambiguation; Lexical and Thematic Knowledge





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jinpeng Wang (1)
  • Wayne Xin Zhao (1)
  • Rui Yan (1)
  • Haitian Wei (2)
  • Jian-Yun Nie (3)
  • Xiaoming Li (1)
  1. Department of Computer Science and Technology, Peking University, China
  2. School of International Trade and Economics, University of International Business and Economics, China
  3. Département d’Informatique et de Recherche Opérationnelle, Université de Montréal, Montreal, Canada
