Date: 20 Jun 2007
Automatically learning semantic knowledge about multiword predicates
Highly frequent and highly polysemous verbs, such as give, take, and make, pose a challenge to automatic lexical acquisition methods. These verbs widely participate in multiword predicates (such as light verb constructions, or LVCs), in which they contribute a broad range of figurative meanings that must be recognized. Here we focus on two properties that are key to the computational treatment of LVCs. First, we consider the degree of figurativeness of the semantic contribution of such a verb to the various LVCs it participates in. Second, we explore the patterns of acceptability of LVCs, and their productivity over semantically related combinations. To assess these properties, we develop statistical measures of figurativeness and acceptability that draw on linguistic properties of LVCs. We demonstrate that these corpus-based measures correlate well with human judgments of the relevant property. We also use the acceptability measure to estimate the degree to which a semantic class of nouns can productively form LVCs with a given verb. The linguistically motivated measures outperform a standard measure for capturing the strength of collocation of these multiword expressions.
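As a rough illustration of what a collocation-strength baseline can look like, the sketch below computes pointwise mutual information (PMI) over verb-noun pairs. The abstract does not name the standard measure used, so PMI is an assumption here, and the function name `pmi_scores` and the toy counts are hypothetical rather than taken from the article.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return PMI (base 2) for every distinct (verb, noun) pair observed in `pairs`.

    Assumption: PMI stands in for the unnamed "standard measure" of
    collocation strength; the article may use a different statistic.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), joint in pair_counts.items():
        p_joint = joint / total
        p_verb = verb_counts[verb] / total
        p_noun = noun_counts[noun] / total
        # PMI compares the observed joint probability with the probability
        # expected if verb and noun occurred independently.
        scores[(verb, noun)] = math.log2(p_joint / (p_verb * p_noun))
    return scores


# Hypothetical toy observations, not corpus data from the article.
toy_pairs = (
    [("give", "groan")] * 3
    + [("give", "book")] * 1
    + [("take", "walk")] * 3
    + [("take", "book")] * 1
)
scores = pmi_scores(toy_pairs)
```

A score like this reflects only how strongly a verb and noun co-occur; it does not use the linguistic properties of LVCs that the figurativeness and acceptability measures described above draw on.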
Language Resources and Evaluation
Volume 41, Issue 1, pp. 61–89
- Springer Netherlands
- Lexical acquisition
- Corpus-based statistical measures
- Verb semantics
- Multiword predicates
- Light verb constructions