Child Acquisition of Multiword Verbs: A Computational Investigation
Abstract
Traditional theories of grammar, as well as computational models of language acquisition, have focused on either word learning or grammar learning. Work on intermediate linguistic constructions (the area between words and combinatory grammar rules) has been very limited. Although recent usage-based theories of language learning emphasize the role of multiword constructions, much remains to be explored concerning the precise computational mechanisms by which children learn to identify and interpret different types of multiword lexemes. The goal of the current study is to bring in ideas from computational linguistics on identifying multiword lexemes, and to explore whether these ideas extend naturally to the domain of child language acquisition. We take a first step toward computational modelling of the acquisition of a widely documented class of multiword verbs, such as "take the train" and "give a kiss", that children must master early in language learning. Specifically, we show that simple statistics based on the linguistic properties of these multiword verbs are informative for identifying them in a corpus of child-directed utterances. We present preliminary experiments demonstrating that such statistics can be used within a word learning model to learn associations between meanings and sequences of words.
Keywords
Meaning Symbol, Word Learning, Linguistic Property, Noun Pair, Meaning Probability