Computational Models of Language Acquisition

Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Abstract

Child language acquisition, one of Nature’s most fascinating phenomena, is to a large extent still a puzzle. Experimental evidence seems to support the view that early language is highly formulaic, consisting for the most part of frozen items with limited productivity. Fairly quickly, however, children find patterns in the ambient language and generalize them to larger structures, in a process that is not yet well understood. Computational models of language acquisition can shed interesting light on this process. This paper surveys various works that address language learning from data; such works are conducted in different fields, including psycholinguistics, cognitive science and computer science, and we maintain that knowledge from all these domains must be consolidated in order for a well-informed model to emerge. We identify the commonalities and differences between the various existing approaches to language learning, and specify desiderata for future research that must be considered by any plausible solution to this puzzle.


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wintner, S. (2010). Computational Models of Language Acquisition. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_8

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
