Robust Semi-supervised and Ensemble-Based Methods in Word Sense Disambiguation
- Cite this paper as:
- Søgaard A., Johannsen A. (2010) Robust Semi-supervised and Ensemble-Based Methods in Word Sense Disambiguation. In: Loftsson H., Rögnvaldsson E., Helgadóttir S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg
Mihalcea discusses self-training and co-training in the context of word sense disambiguation and shows that parameter optimization on individual words is important for obtaining good results. Using smoothed co-training of a naive Bayes classifier, she obtains a 9.8% error reduction on Senseval-2 data with a fixed parameter setting. In this paper we test a semi-supervised learning algorithm with no parameters, namely tri-training. We also test the random subspace method for building committees of stable learners. Both techniques lead to significant error reductions with different learning algorithms, but the improvements do not accumulate. Our best error reduction is 7.4%, and our best absolute average over Senseval-2 data, though not directly comparable, is 12% higher than the result reported by Mihalcea.
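For readers unfamiliar with tri-training (Zhou and Li), the idea is: train three classifiers on bootstrap samples of the labeled data, then repeatedly let each classifier learn from unlabeled examples on which the other two agree, and predict by majority vote. The sketch below is an illustrative simplification, not the authors' implementation: it uses a toy nearest-centroid base learner, omits the error-rate safeguards of the full algorithm, and all function names and data are ours.

```python
import random
from collections import Counter

def fit_centroid(X, y):
    """Per-class mean vectors; a deliberately simple stand-in base learner."""
    cents = {}
    for lab in set(y):
        rows = [x for x, l in zip(X, y) if l == lab]
        cents[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def predict_centroid(cents, x):
    """Label of the nearest class centroid (squared Euclidean distance)."""
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, cents[lab])))

def tri_train(X, y, U, rounds=3, seed=0):
    """Simplified tri-training: X, y labeled data; U unlabeled examples."""
    rng = random.Random(seed)
    # Three learners, each trained on a bootstrap sample of the labeled data.
    sets = []
    for _ in range(3):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        sets.append(([X[i] for i in idx], [y[i] for i in idx]))
    models = [fit_centroid(Xi, yi) for Xi, yi in sets]
    for _ in range(rounds):
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            Xi, yi = sets[i]
            newX, newy = list(Xi), list(yi)
            for u in U:
                pj = predict_centroid(models[j], u)
                pk = predict_centroid(models[k], u)
                if pj == pk:  # the other two agree: pseudo-label for learner i
                    newX.append(u)
                    newy.append(pj)
            sets[i] = (newX, newy)
        models = [fit_centroid(Xi, yi) for Xi, yi in sets]
    return models

def predict(models, x):
    """Majority vote over the three tri-trained learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

No per-word parameter tuning is involved, which is the property the paper exploits: once the base learner is fixed, tri-training itself has nothing to optimize.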