Robust Semi-supervised and Ensemble-Based Methods in Word Sense Disambiguation

  • Anders Søgaard
  • Anders Johannsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6233)

Abstract

Mihalcea [1] discusses self-training and co-training in the context of word sense disambiguation and shows that parameter optimization on individual words is important for obtaining good results. Using smoothed co-training of a naive Bayes classifier, she obtains a 9.8% error reduction on Senseval-2 data with a fixed parameter setting. In this paper we test a semi-supervised learning algorithm with no parameters, namely tri-training [2]. We also test the random subspace method [3] for building committees of stable learners. Both techniques lead to significant error reductions with different learning algorithms, but the improvements do not accumulate. Our best error reduction is 7.4%, and our best absolute average over the Senseval-2 data, though not directly comparable, is 12% higher than the result reported in Mihalcea [1].
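For readers unfamiliar with tri-training [2], the following is a minimal Python sketch of its core loop, assuming scikit-learn-style classifiers and toy bag-of-words data; the function names and data are illustrative assumptions, and the error-rate conditions of Li and Zhou's original algorithm are omitted.

import numpy as np
from sklearn.base import clone
from sklearn.naive_bayes import MultinomialNB

def tri_train(base, X_lab, y_lab, X_unlab, rounds=5, seed=0):
    """Simplified tri-training: three classifiers teach each other on
    unlabeled examples where the other two agree (error-rate checks omitted)."""
    rng = np.random.default_rng(seed)
    # Initialize three classifiers on bootstrap samples of the labeled data.
    clfs = []
    for _ in range(3):
        idx = rng.integers(0, len(X_lab), len(X_lab))
        clfs.append(clone(base).fit(X_lab[idx], y_lab[idx]))
    for _ in range(rounds):
        preds = [c.predict(X_unlab) for c in clfs]
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            agree = preds[j] == preds[k]
            if agree.any():
                # Retrain classifier i on the labeled data plus the
                # unlabeled examples its two peers agree on.
                X_new = np.vstack([X_lab, X_unlab[agree]])
                y_new = np.concatenate([y_lab, preds[j][agree]])
                clfs[i] = clone(base).fit(X_new, y_new)
    return clfs

def vote(clfs, X):
    # Final prediction by majority vote of the three classifiers.
    preds = np.stack([c.predict(X) for c in clfs])
    return np.array([np.bincount(col).argmax() for col in preds.T])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_lab = rng.integers(0, 5, size=(40, 20))    # toy word-count features
    y_lab = rng.integers(0, 2, size=40)          # two word senses
    X_unlab = rng.integers(0, 5, size=(200, 20))
    print(vote(tri_train(MultinomialNB(), X_lab, y_lab, X_unlab), X_unlab[:10]))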

Keywords

Support Vector Machine · Unlabeled Data · Error Reduction · Word Sense Disambiguation · Random Subspace

References

  1. Mihalcea, R.: Co-training and self-training for word sense disambiguation. In: CoNLL, Boston, MA (2004)
  2. Li, M., Zhou, Z.H.: Tri-training: exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11), 1529–1541 (2005)
  3. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(8), 832–844 (1998)
  4. Abney, S.: Semi-supervised Learning for Computational Linguistics. Chapman and Hall, Boca Raton (2008)
  5. Chen, W., Zhang, Y., Isahara, H.: Chinese chunking with tri-training learning. In: Matsumoto, Y., Sproat, R.W., Wong, K.-F., Zhang, M. (eds.) ICCPOL 2006. LNCS (LNAI), vol. 4285, pp. 466–473. Springer, Heidelberg (2006)
  6. Nguyen, T., Nguyen, L., Shimazu, A.: Using semi-supervised learning for question classification. Journal of Natural Language Processing 15, 3–21 (2008)
  7. García-Pedrajas, N., Ortiz-Boyer, D.: Boosting random subspace method. Neural Networks 21(9), 1344–1362 (2008)
  8. Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Annals of Statistics 28(2), 337–407 (2000)
  9. Frank, E., Witten, I.: Generating accurate rule sets without global optimization. In: The 15th International Conference on Machine Learning (1998)
  10. Sindhwani, V., Keerthi, S.: Large scale semi-supervised linear SVMs. In: ACM SIGIR, Seattle, WA (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Anders Søgaard (1)
  • Anders Johannsen (1)
  1. Centre for Language Technology, University of Copenhagen, Copenhagen S
