Finding the Most Frequent Sense of a Word by the Length of Its Definition

  • Hiram Calvo
  • Alexander Gelbukh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8856)

Abstract

Most frequent sense (MFS) is a very powerful heuristic in word sense disambiguation, and one that is extremely difficult to outperform even with sophisticated methods. We show that counting the number of words, characters, or relationships in a word’s sense definitions allows guessing the most frequent sense of the word: the MFS usually has a longer gloss, more examples of usage, and more relationships with other words (synonyms, hyponyms, etc.). In addition, we show that this effect is resource-dependent, causing some algorithms to perform differently with different dictionaries.
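
The counting heuristic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Sense` record, the three measure functions, `guess_mfs`, and the toy entries for "bank" are all hypothetical names and made-up data introduced here for illustration, and the tie-breaking rule (keep the first listed sense) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Sense:
    """Hypothetical dictionary entry for one sense of a word."""
    sense_id: str
    gloss: str
    examples: List[str] = field(default_factory=list)
    relations: List[str] = field(default_factory=list)  # synonyms, hyponyms, etc.


def word_count(sense: Sense) -> int:
    """Number of whitespace-separated words in the gloss and usage examples."""
    return len(sense.gloss.split()) + sum(len(e.split()) for e in sense.examples)


def char_count(sense: Sense) -> int:
    """Number of characters in the gloss and usage examples."""
    return len(sense.gloss) + sum(len(e) for e in sense.examples)


def relation_count(sense: Sense) -> int:
    """Number of relationships with other words listed for this sense."""
    return len(sense.relations)


def guess_mfs(senses: Sequence[Sense],
              measure: Callable[[Sense], int] = word_count) -> Sense:
    """Guess the most frequent sense as the one with the largest measure.

    Ties go to the sense listed first (an assumption of this sketch).
    """
    if not senses:
        raise ValueError("at least one sense is required")
    return max(senses, key=measure)


# Made-up toy entries, not taken from any real dictionary.
bank_senses = [
    Sense("bank#1",
          "a financial institution that accepts deposits and channels the money into lending activities",
          examples=["he cashed a check at the bank"],
          relations=["depository", "financial_institution", "credit_union"]),
    Sense("bank#2",
          "sloping land beside a body of water",
          relations=["slope"]),
]

for measure in (word_count, char_count, relation_count):
    print(measure.__name__, guess_mfs(bank_senses, measure).sense_id)
```

Passing the measure as a parameter makes it easy to compare word, character, and relationship counts on the same dictionary, or to run the same measure over different dictionaries, which is where the abstract reports that results differ.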

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hiram Calvo (1)
  • Alexander Gelbukh (1)
  1. Instituto Politécnico Nacional, Centro de Investigación en Computación, D.F., Mexico
