Crossing Parallel Corpora and Multilingual Lexical Databases for WSD

  • Alfio Massimiliano Gliozzo
  • Marcello Ranieri
  • Carlo Strapparava
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3406)


Word Sense Disambiguation (WSD) is the task of selecting the correct sense of a word in context from a sense repository. State-of-the-art performance is typically obtained by approaching WSD as a supervised classification task (e.g. [1]); under the word-expert approach, this requires a large number of sense-tagged examples for each sense of each word. This requirement makes the supervised approach infeasible for “all-words” tasks, which consist of disambiguating all the words in a text. This problem has been called the Knowledge Acquisition Bottleneck, and many solutions have been proposed for it (see, for example, [2]).


Keywords: Word Pair; Word Sense Disambiguation; Parallel Corpus; Lexical Resource; Polysemous Word
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  1. Strapparava, C., Gliozzo, A., Giuliano, C.: Pattern abstraction and term similarity for word sense disambiguation: IRST at Senseval-3. In: Proc. of SENSEVAL-3, Barcelona, Spain (2004)
  2. Mihalcea, R., Moldovan, D.: An automatic method for generating sense tagged corpora. In: Proc. of AAAI 1999, Orlando, FL (1999)
  3. Magnini, B., Strapparava, C.: Experiments in word domain disambiguation for parallel texts. In: Proc. of Word Senses and Multi-Linguality, Hong Kong, Workshop held in conjunction with ACL 2000 (2000)
  4. Diab, M., Resnik, P.: An unsupervised method for word sense tagging using parallel texts. In: Proc. of ACL 2002, Philadelphia (2002)
  5. Bentivogli, L., Pianta, E.: Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor corpus. Journal of Natural Language Engineering (NLE), Special Issue on Parallel Texts (to appear)
  6. Koehn, P.: EuroParl: A multilingual corpus for evaluation of machine translation (Unpublished),

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Alfio Massimiliano Gliozzo (1)
  • Marcello Ranieri (1)
  • Carlo Strapparava (1)
  1. ITC-irst, Trento, Italy
