Abstract
This paper describes how to automatically classify the functional relations in the Factotum knowledge base using a statistical machine learning algorithm. The approach incorporates a method for inferring prepositional relation indicators from corpus data. It also uses lexical collocations (i.e., word associations) and class-based collocations derived from WordNet hypernym relations (i.e., is-subset-of). The results show a substantial improvement over a baseline approach.
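To make the idea of class-based collocations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written hypernym table in place of WordNet, invented training pairs, and a simple selection rule (keep a class when the probability of a relation given that class exceeds the relation's prior by a margin). The names `TOY_HYPERNYMS`, `TRAINING`, `class_collocations`, `features`, and the `min_gain` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical toy hypernym table standing in for WordNet (not real WordNet data).
TOY_HYPERNYMS = {
    "hammer": ["tool", "artifact"],
    "saw": ["tool", "artifact"],
    "kitchen": ["room", "place"],
    "garage": ["room", "place"],
}

# Hypothetical training pairs: (context word, functional relation label).
TRAINING = [
    ("hammer", "instrument"),
    ("saw", "instrument"),
    ("kitchen", "location"),
    ("garage", "location"),
    ("hammer", "location"),
]


def hypernym_classes(word):
    """Return the word's hypernym classes from the toy table (empty if unknown)."""
    return TOY_HYPERNYMS.get(word, [])


def class_collocations(pairs, min_gain=0.2):
    """Keep (class, relation) pairs where P(relation | class) beats P(relation) by min_gain."""
    relation_counts = Counter(rel for _, rel in pairs)
    class_counts = Counter()
    joint_counts = Counter()
    for word, rel in pairs:
        for cls in hypernym_classes(word):
            class_counts[cls] += 1
            joint_counts[(cls, rel)] += 1
    total = len(pairs)
    selected = {}
    for (cls, rel), joint in joint_counts.items():
        prior = relation_counts[rel] / total
        conditional = joint / class_counts[cls]
        if conditional - prior >= min_gain:
            selected[(cls, rel)] = round(conditional, 3)
    return selected


def features(word, selected, relations):
    """One binary feature per relation: does any hypernym class of the word collocate with it?"""
    classes = set(hypernym_classes(word))
    return {rel: int(any((cls, rel) in selected for cls in classes)) for rel in relations}


selected = class_collocations(TRAINING)
saw_features = features("saw", selected, ["instrument", "location"])
```

Generalizing from words to hypernym classes lets a word that is rare in the training data (here, "saw") still contribute evidence through the classes it shares with other words. A real system would draw the classes from WordNet and feed the resulting features to a classifier.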
Patrick Cassidy of Micra, Inc. kindly made Factotum available and provided valuable input on the paper. Michael O’Hara helped greatly with the proofreading. The first author is supported by a generous GAANN fellowship from the Department of Education. Some of the work used computing resources at NMSU made possible through MII Grants EIA-9810732 and EIA-0220590.
Factotum is based on the public domain version of Roget’s Thesaurus. The latter is freely available via Project Gutenberg (http://promo.net/pg), thanks to Micra, Inc.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
O’Hara, T., Wiebe, J. (2003). Classifying Functional Relations in Factotum via WordNet Hypernym Associations. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_36
DOI: https://doi.org/10.1007/3-540-36456-0_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive