Computational Modeling as a Methodology for Studying Human Language Learning

  • Thierry Poibeau
  • Aline Villavicencio
  • Anna Korhonen
  • Afra Alishahi
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

The nature and amount of information needed for learning a natural language, and the underlying mechanisms involved in this process, are the subject of much debate. How is knowledge of language represented in the human brain? Is it possible to learn a language from usage data alone, or is some sort of innate knowledge and/or bias needed to boost the process? Are different aspects of language learned in a particular order? These are topics of interest to (psycho)linguists who study human language acquisition, as well as to computational linguists who develop the knowledge sources necessary for large-scale natural language processing systems. Children are the ultimate subjects of any study of language learnability: they learn language with ease and in a short period of time, and their acquired knowledge of language is flexible and robust.

Keywords

Language Acquisition; Specific Language Impairment; Minimum Description Length; Argument Structure; Semantic Role
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Thierry Poibeau (1)
  • Aline Villavicencio (2)
  • Anna Korhonen (3, 4)
  • Afra Alishahi (5)

  1. Laboratoire Langues, Textes, Traitements informatiques, Cognition, CNRS, Ecole Normale Supérieure and Université Sorbonne Nouvelle, Paris, France
  2. Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
  3. Computer Laboratory, University of Cambridge, Cambridge, UK
  4. Department of Theoretical and Applied Linguistics (DTAL), Cambridge, UK
  5. Department of Communication and Information Studies, Tilburg University, Tilburg, The Netherlands
