Robust Semi-supervised and Ensemble-Based Methods in Word Sense Disambiguation

  • Anders Søgaard
  • Anders Johannsen
Conference paper

DOI: 10.1007/978-3-642-14770-8_43

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6233)
Cite this paper as:
Søgaard A., Johannsen A. (2010) Robust Semi-supervised and Ensemble-Based Methods in Word Sense Disambiguation. In: Loftsson H., Rögnvaldsson E., Helgadóttir S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg

Abstract

Mihalcea [1] discusses self-training and co-training in the context of word sense disambiguation and shows that parameter optimization on individual words is important for obtaining good results. Using smoothed co-training of a naive Bayes classifier, she obtains a 9.8% error reduction on Senseval-2 data with a fixed parameter setting. In this paper we test a semi-supervised learning algorithm with no parameters, namely tri-training [2]. We also test the random subspace method [3] for building committees out of stable learners. Both techniques lead to significant error reductions with different learning algorithms, but the improvements do not accumulate. Our best error reduction is 7.4%, and our best absolute average over Senseval-2 data, though not directly comparable, is 12% higher than the results reported by Mihalcea [1].
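The random subspace method [3] mentioned in the abstract builds an ensemble by training each committee member on a random subset of the feature dimensions and combining members by majority vote. Below is a minimal, hedged sketch of that idea using a simple nearest-centroid base learner; the function names, the base learner, and the parameter choices are illustrative assumptions, not the paper's actual setup (the paper applies the method to stable learners on word sense disambiguation features).

```python
import random
from collections import Counter

def train_subspace_committee(X, y, n_members=5, subspace_frac=0.5, seed=0):
    """Random subspace method (sketch): train each committee member on a
    random subset of feature dimensions. Base learner here is a simple
    per-class centroid classifier, chosen only for illustration."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(subspace_frac * n_features))
    committee = []
    for _ in range(n_members):
        dims = rng.sample(range(n_features), k)  # this member's subspace
        sums, counts = {}, {}
        for xi, yi in zip(X, y):
            proj = [xi[d] for d in dims]  # project example onto subspace
            if yi not in sums:
                sums[yi] = [0.0] * k
                counts[yi] = 0
            sums[yi] = [s + v for s, v in zip(sums[yi], proj)]
            counts[yi] += 1
        centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        committee.append((dims, centroids))
    return committee

def predict(committee, x):
    """Classify x by majority vote over the committee; each member
    picks the nearest class centroid within its own subspace."""
    votes = []
    for dims, centroids in committee:
        proj = [x[d] for d in dims]
        best = min(
            centroids,
            key=lambda c: sum((p - m) ** 2 for p, m in zip(proj, centroids[c])),
        )
        votes.append(best)
    return Counter(votes).most_common(1)[0][0]
```

Because every member sees a different projection of the data, the committee's votes can disagree even when the base learner is stable, which is exactly the diversity the method relies on.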


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Anders Søgaard
    • 1
  • Anders Johannsen
    • 1
  1. Centre for Language Technology, University of Copenhagen, Copenhagen S
