Abstract
Word groupings useful for language processing tasks are increasingly available, as thesauri appear on-line, and as distributional word clustering techniques improve. However, for many tasks, one is interested in relationships among word senses, not words. This paper presents a method for automatic sense disambiguation of nouns appearing within sets of related nouns — the kind of data one finds in on-line thesauri, or as the output of distributional clustering algorithms. Disambiguation is performed with respect to WordNet senses, which are fairly fine-grained; however, the method also permits the assignment of higher-level WordNet categories rather than sense labels. The method is illustrated primarily by example, though results of a more rigorous evaluation are also presented.
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Resnik, P. (1999). Disambiguating Noun Groupings with Respect to WordNet Senses. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds) Natural Language Processing Using Very Large Corpora. Text, Speech and Language Technology, vol 11. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2390-9_6
DOI: https://doi.org/10.1007/978-94-017-2390-9_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5349-7
Online ISBN: 978-94-017-2390-9
eBook Packages: Springer Book Archive