Abstract
This paper describes a heuristic approach to the automatic acquisition of contextual representations of senses in a machine-readable dictionary (MRD). Including contextual information in an MRD-based lexical database offers several benefits. First, the representation can be used to merge closely related senses into a coarser sense division, so that unnecessarily fine sense distinctions can be avoided in word sense disambiguation (WSD). Second, the contextual information can serve as a knowledge base for developing a WSD system. Third, if the algorithms are run on several MRDs, the contextual representation provides a means of linking relevant senses across those MRDs to create an integrated lexical database. The algorithms rely primarily on information retrieval techniques to build, for each MRD sense, a list of the topics most relevant to its definition. We describe an implementation of the method that uses the definition sentences in the Longman Dictionary of Contemporary English together with the topical word lists and topical cross-references in the Longman Lexicon of Contemporary English. We have conducted a series of experiments and evaluations to assess the performance of the proposed approach.
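As a rough illustration of the idea of ranking topics against a sense definition, the sketch below scores each topic by the IDF-weighted overlap between the definition's words and the topic's word list. It is only a minimal sketch under stated assumptions, not the chapter's actual algorithm: the topic names, the word lists in `TOPIC_WORDS`, the weighting scheme, and the function names are all hypothetical, and the topical cross-references are not modeled.

```python
import math
from collections import Counter

# Hypothetical toy topical word lists; these are NOT taken from the Longman Lexicon.
TOPIC_WORDS = {
    "money": ["bank", "money", "account", "pay", "keep"],
    "geography": ["river", "land", "bank", "water", "side"],
    "plants": ["tree", "leaf", "water", "grow"],
}

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]

def idf_weights(topic_words):
    """Treat each topic as a 'document'; words shared by many topics get lower weight."""
    n_topics = len(topic_words)
    doc_freq = Counter(w for words in topic_words.values() for w in set(words))
    return {w: math.log(n_topics / df) + 1.0 for w, df in doc_freq.items()}

def rank_topics(definition, topic_words, top_k=2):
    """Return up to top_k (topic, score) pairs, best first; topics with no overlap are dropped."""
    idf = idf_weights(topic_words)
    counts = Counter(tokenize(definition))
    scores = {}
    for topic, words in topic_words.items():
        vocab = set(words)
        score = sum(c * idf[w] for w, c in counts.items() if w in vocab)
        if score > 0:
            scores[topic] = score
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

if __name__ == "__main__":
    print(rank_topics("land along the side of a river", TOPIC_WORDS))
```

In this toy setup, two senses of the same headword whose definitions rank the same topic first would be candidates for merging, while senses with different top topics would stay apart. A realistic version would also need stemming and stop-word handling, which are omitted here.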
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Chen, J.N., Chang, J.S. (1999). Integrating Machine Readable Dictionary and Thesaurus for Conceptual Context Representation of Word Sense. In: Viegas, E. (eds) Breadth and Depth of Semantic Lexicons. Text, Speech and Language Technology, vol 10. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-0952-1_10
DOI: https://doi.org/10.1007/978-94-017-0952-1_10
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5347-3
Online ISBN: 978-94-017-0952-1
eBook Packages: Springer Book Archive