Defining Classifier Regions for WSD Ensembles Using Word Space Features

Saarikoski, Harri M. T.; Legrand, Steve; Gelbukh, Alexander

doi:10.1007/11925231_82


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4293)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence


Abstract

Based on a recent evaluation of word sense disambiguation (WSD) systems [10], disambiguation methods have reached a standstill. In [10] we showed that it is possible to predict the best system for a target word using word features, and that this 'optimal ensembling method' yields more accurate WSD ensembles (3-5% over Senseval state-of-the-art systems, with the same amount of potential improvement still remaining). In the interest of developing even more accurate ensembles, we here define the strong regions for three popular and effective classifiers used for the WSD task (Naive Bayes – NB, Support Vector Machine – SVM, Decision Rules – D) using word features (word grain, amount of positive and negative training examples, dominant sense ratio). We also discuss the effect of the remaining, feature-based factors.
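The idea of predicting the best classifier for a target word from its word features can be illustrated with a minimal sketch. This is not the authors' method: it assumes a plain 1-nearest-neighbour selector, a simplified feature triple (word grain, number of training examples, dominant sense ratio), and invented toy records. The names `TRAIN_WORDS`, `fit_selector`, and `scale` are hypothetical.

```python
from math import sqrt

# Hypothetical per-word records: (grain, n_train, dominant_sense_ratio) -> best classifier.
# All values are invented for illustration; they are NOT results from the paper.
TRAIN_WORDS = [
    ((3, 200, 0.90), "NB"),
    ((4, 180, 0.85), "NB"),
    ((10, 60, 0.40), "SVM"),
    ((12, 50, 0.35), "SVM"),
    ((6, 120, 0.60), "D"),
    ((7, 110, 0.55), "D"),
]


def scale(features, lo, hi):
    """Min-max scale each feature to [0, 1] so no single feature dominates the distance."""
    return tuple(
        (f - l) / (h - l) if h > l else 0.0
        for f, l, h in zip(features, lo, hi)
    )


def fit_selector(records):
    """Return a function mapping word features to a predicted best classifier (1-NN)."""
    columns = list(zip(*(feats for feats, _ in records)))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    scaled = [(scale(feats, lo, hi), label) for feats, label in records]

    def predict(features):
        x = scale(features, lo, hi)
        # Euclidean distance to every training word; pick the closest one's label.
        nearest = min(
            scaled,
            key=lambda rec: sqrt(sum((a - b) ** 2 for a, b in zip(x, rec[0]))),
        )
        return nearest[1]

    return predict


selector = fit_selector(TRAIN_WORDS)
# An ensemble would then route each target word to the classifier chosen here.
choice = selector((11, 55, 0.38))
```

In such a scheme, the "strong region" of a classifier corresponds to the part of the word-feature space where the selector returns that classifier; any real selector would be trained on measured per-word accuracies rather than toy labels.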


References

1. Edmonds, P., Kilgarriff, A.: Introduction to the Special Issue on evaluating word sense disambiguation programs. Journal of Natural Language Engineering 8(4) (2002)
2. Forman, G., Cohen, I.: Learning from Little: Comparison of Classifiers Given Little Training. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, Springer, Heidelberg (2004), http://ecmlpkdd.isti.cnr.it/
3. Hoste, V., Hendrickx, I., Daelemans, W., van den Bosch, A.: Parameter optimization for machine-learning of word sense disambiguation. Journal of Natural Language Engineering 8(4), 311–327 (2002)
4. Legrand, S., Pulido, J.G.R.: A Hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labeling. In: Knowledge Discovery and Ontologies workshop at 15th European Conference on Machine Learning (ECML) (2004)
5. Luo, F., Khan, L., Bastani, F., Yen, I.-L., Zhou, J.: A dynamically growing self-organizing tree (DGSOT) for hierarchical clustering gene expression profiles. Bioinformatics 20(16), 2605–2617 (2004)
6. Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., Euler, T.: YALE: Rapid Prototyping for Complex Data Mining Tasks. In: Proceedings of the 12th ACM SIGKDD (KDD 2006) (2006)
7. Mihalcea, R.: Word sense disambiguation with pattern learning and automatic feature selection. Journal of Natural Language Engineering 8(4), 343–359 (2002)
8. Mihalcea, R., Kilgarriff, A., Chklovski, T.: The SENSEVAL-3 English lexical sample task. In: Proceedings of SENSEVAL-3 Workshop at ACL (2004)
9. Pedersen, T.: Machine Learning with Lexical Features: The Duluth Approach to Senseval-2. In: Proceedings of SENSEVAL-2: Second International Workshop on Evaluating Word Sense Disambiguation Systems (2002)
10. Saarikoski, H., Legrand, S.: Building an Optimal WSD Ensemble Using Per-Word Selection of Best System. In: Sanfeliu, A., Cortés, M.L. (eds.) CIARP 2005. LNCS, vol. 3773, Springer, Heidelberg (2005)
11. Witten, I., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
12. Yarowsky, D., Cucerzan, S., Florian, R., Schafer, C., Wicentowski, R.: The Johns Hopkins SENSEVAL2 System Descriptions. In: Proceedings of SENSEVAL-2 workshop (2002)
13. Yarowsky, D., Florian, R.: Evaluating sense disambiguation across diverse parameter spaces. Journal of Natural Language Engineering 8(4), 293–311 (2002)
14. Zavrel, J., Degroeve, S., Kool, A., Daelemans, W., Jokinen, K.: Diverse Classifiers for NLP Disambiguation Tasks. Comparisons, Optimization, Combination, and Evolution. In: TWLT 18. Learning to Behave. CEvoLE 2, pp. 201–221 (2000)
15. Seo, H.-C., Rim, H.-C., Kim, S.-H.: KUNLP system in Senseval-2. In: Proceedings of SENSEVAL-2 Workshop, pp. 222–225 (2001)
16. Strapparava, C., Gliozzo, A., Giuliano, C.: Pattern abstraction and term similarity for Word Sense Disambiguation: IRST at Senseval-3. In: Proceedings of SENSEVAL-3 workshop (2004)
17. Manning, C., Tolga Ilhan, H., Kamvar, S., Klein, D., Toutanova, K.: Combining Heterogeneous Classifiers for Word-Sense Disambiguation. In: Proceedings of SENSEVAL-2, Second International Workshop on Evaluating WSD Systems, pp. 87–90 (2001)
18. Lee, Y.-K., Ng, H.-T., Chia, T.-K.: Supervised Word Sense Disambiguation with Support Vector Machines and Multiple Knowledge Sources. In: Proceedings of SENSEVAL-3 workshop (2004)
19. Grozea, C.: Finding optimal parameter settings for high performance word sense disambiguation. In: SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain (2004)


Author information

Authors and Affiliations

KIT Language Technology Doctorate School, Helsinki University, Finland
Harri M. T. Saarikoski
Department of Computer Science, University of Jyväskylä, Finland
Steve Legrand
Instituto Politecnico Nacional, Mexico City, Mexico
Steve Legrand & Alexander Gelbukh


Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, 07738, Mexico City, México
Alexander Gelbukh
Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Luis Enrique Erro No. 1, Sta. Ma. Tonanzintla, 72840, Puebla, México
Carlos Alberto Reyes-Garcia


About this paper

Cite this paper

Saarikoski, H.M.T., Legrand, S., Gelbukh, A. (2006). Defining Classifier Regions for WSD Ensembles Using Word Space Features. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_82


DOI: https://doi.org/10.1007/11925231_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49026-5
Online ISBN: 978-3-540-49058-6
eBook Packages: Computer Science, Computer Science (R0)
