Child Acquisition of Multiword Verbs: A Computational Investigation

Part of the Theory and Applications of Natural Language Processing book series (NLP)


Traditional theories of grammar, as well as computational models of language acquisition, have focused on either word learning or grammar learning. Work on intermediate linguistic constructions (the area between words and combinatory grammar rules) has been very limited. Although recent usage-based theories of language learning emphasize the role of multiword constructions, much remains to be explored concerning the precise computational mechanisms by which children learn to identify and interpret different types of multiword lexemes. The goal of the current study is to bring in ideas from computational linguistics on identifying multiword lexemes, and to explore whether these ideas extend naturally to the domain of child language acquisition. We take a first step toward computational modelling of the acquisition of a widely documented class of multiword verbs, such as "take the train" and "give a kiss", that children must master early in language learning. Specifically, we show that simple statistics based on the linguistic properties of these multiword verbs are informative for identifying them in a corpus of child-directed utterances. We present preliminary experiments demonstrating that such statistics can be used within a word learning model to learn associations between meanings and sequences of words.
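As a rough illustration of what a "simple statistic" over verb-noun pairs can look like, the sketch below computes pointwise mutual information (PMI) for each pair. The abstract does not say which statistics the study uses, so PMI is only a stand-in association measure chosen for illustration; the function name `pmi_scores` and the toy pair counts are hypothetical and are not drawn from the chapter or from any real corpus.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return log2 PMI for each (verb, noun) pair in a list of observed pairs.

    PMI is an assumed stand-in measure, not the statistic used in the study.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), count in pair_counts.items():
        p_pair = count / total
        p_verb = verb_counts[verb] / total
        p_noun = noun_counts[noun] / total
        # PMI compares the observed joint probability with the value
        # expected if the verb and the noun occurred independently.
        scores[(verb, noun)] = math.log2(p_pair / (p_verb * p_noun))
    return scores


# Invented toy data: each tuple is one verb-object occurrence.
pairs = (
    [("give", "kiss")] * 3
    + [("give", "ball")]
    + [("take", "train")] * 2
    + [("take", "ball")] * 2
)
scores = pmi_scores(pairs)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

On this toy data, pairs whose noun occurs with only one verb receive a higher score than pairs whose noun is shared across verbs. A high association score alone does not establish that a pair is a multiword verb; it is one possible cue of the kind the abstract alludes to.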


Keywords: Meaning Symbol, Word Learning, Linguistic Property, Noun Pair, Meaning Probability



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Science, University of Toronto, Toronto, Canada
  2. School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
