Abstract
Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, with that of alternative algorithms and methods. However, little attention has been paid to analyzing the lexical resources and corpora that define the experimental settings, or to their possible interactions with the overall results. In this paper we present experiments supporting the hypothesis that the quality of the lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. To verify this hypothesis we carried out two kinds of experiments. At the linguistic level, we tested the quality of the lexical resources in terms of the degree of agreement among annotators. From the computational point of view, we evaluated how the different lexical resources affect the accuracy of the resulting WSD classifiers. The experiments use three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.
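The abstract does not say how the degree of agreement among annotators is computed. As a minimal sketch, assuming each annotator assigns exactly one sense tag per corpus instance, the following shows two common measures: raw observed agreement and Cohen's kappa, which corrects for chance. The function names and the sense labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def observed_agreement(tags_a, tags_b):
    """Fraction of instances on which two annotators chose the same sense."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("annotations must be non-empty and aligned")
    return sum(a == b for a, b in zip(tags_a, tags_b)) / len(tags_a)


def cohen_kappa(tags_a, tags_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    n = len(tags_a)
    p_o = observed_agreement(tags_a, tags_b)
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    # Expected agreement if both annotators tagged independently,
    # each following their own sense distribution.
    p_e = sum(freq_a[s] * freq_b[s] for s in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single, identical sense throughout
    return (p_o - p_e) / (1.0 - p_e)


def mean_pairwise_agreement(annotations):
    """Average observed agreement over all pairs of annotators."""
    pairs = list(combinations(annotations, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(observed_agreement(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical sense tags for four occurrences of one target word.
    annotator_1 = ["s1", "s1", "s2", "s2"]
    annotator_2 = ["s1", "s2", "s2", "s2"]
    print(observed_agreement(annotator_1, annotator_2))
    print(cohen_kappa(annotator_1, annotator_2))
```

Under this assumption, running the same computation once per sense inventory on the same corpus instances would give one agreement figure per lexical resource, which is the kind of comparison the linguistic-level experiment describes.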
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Màrquez, L., Taulé, M., Padró, L., Villarejo, L., Martí, M.A. (2004). On the Quality of Lexical Resources for Word Sense Disambiguation. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds) Advances in Natural Language Processing. EsTAL 2004. Lecture Notes in Computer Science, vol 3230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30228-5_26
DOI: https://doi.org/10.1007/978-3-540-30228-5_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23498-2
Online ISBN: 978-3-540-30228-5
eBook Packages: Springer Book Archive