Improving Unsupervised WSD with a Dynamic Thesaurus

  • Javier Tejada-Cárcamo
  • Hiram Calvo
  • Alexander Gelbukh
Conference paper

DOI: 10.1007/978-3-540-87391-4_27

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5246)
Cite this paper as:
Tejada-Cárcamo J., Calvo H., Gelbukh A. (2008) Improving Unsupervised WSD with a Dynamic Thesaurus. In: Sojka P., Horák A., Kopeček I., Pala K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg

Abstract

The method proposed by McCarthy et al. [1] obtains the predominant sense of an ambiguous word from a weighted list of terms related to that word. This list is taken from a thesaurus built with the distributional similarity method proposed by Lin [2]. In that method, every occurrence of the ambiguous word, and every different word to be disambiguated, uses the same thesaurus regardless of the context in which it occurs. In this paper we explore a different method that takes the context of a word into account when determining the most frequent sense of an ambiguous word. In our method, the list of distributionally similar words is built from the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline obtained by taking the most frequent sense (MFS) from 90% of SemCor and evaluating it on the remaining 10% of SemCor.
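
The contrast between a single thesaurus for all occurrences and a context-dependent one can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function name `rank_senses`, the normalised weighted-vote scoring, the sense labels, and all weights and similarity values are invented toy inputs, not the authors' implementation or data.

```python
def rank_senses(neighbors, sense_similarity):
    """Rank the senses of an ambiguous word by weighted votes from related terms.

    neighbors: {related_term: weight}, a weighted list of terms related to the
        ambiguous word (hypothetical input format).
    sense_similarity: {sense: {related_term: similarity}}, how close each sense
        is to each related term (hypothetical input format).
    """
    scores = {sense: 0.0 for sense in sense_similarity}
    for term, weight in neighbors.items():
        # Total similarity of this term to all senses, used to normalise its vote.
        total = sum(sims.get(term, 0.0) for sims in sense_similarity.values())
        if total == 0.0:
            continue  # the term says nothing about any sense
        for sense, sims in sense_similarity.items():
            scores[sense] += weight * sims.get(term, 0.0) / total
    # Highest-scoring sense first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy values, invented for illustration only.
sense_similarity = {
    "bank/finance": {"money": 0.9, "loan": 0.8, "river": 0.1, "shore": 0.1},
    "bank/river": {"money": 0.1, "loan": 0.2, "river": 0.9, "shore": 0.9},
}

# One related-term list shared by every occurrence of the word.
static_neighbors = {"money": 0.5, "loan": 0.4, "river": 0.2, "shore": 0.1}

# A related-term list built for one particular occurrence and its context.
context_neighbors = {"river": 0.6, "shore": 0.5, "money": 0.1}

static_ranking = rank_senses(static_neighbors, sense_similarity)
context_ranking = rank_senses(context_neighbors, sense_similarity)
print(static_ranking[0][0])
print(context_ranking[0][0])
```

With the shared list, every occurrence of the word receives the same top sense. With a list built per occurrence, the ranking can change with the context, which is the behaviour the abstract attributes to the proposed method.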

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Javier Tejada-Cárcamo (1, 2)
  • Hiram Calvo (1)
  • Alexander Gelbukh (1)
  1. Center for Computing Research, National Polytechnic Institute, Mexico City, México
  2. Sociedad Peruana de Computación, Arequipa, Perú
