TSD 2008: Text, Speech and Dialogue, pp. 261-268
Language Acquisition: The Emergence of Words from Multimodal Input
Abstract
Young infants learn words by detecting patterns in the speech signal and by associating these patterns with stimuli provided by non-speech modalities (such as vision). In this paper, we discuss a computational model that is able to detect and build word-like representations on the basis of multimodal input data. Learning of words (and word-like entities) takes place within a communicative loop between a ‘carer’ and the ‘learner’. Experiments carried out on three different European languages (Finnish, Swedish, and Dutch) show that a robust word representation can be learned from approximately 50 acoustic tokens (examples) of that word. The model is inspired by the memory structure that is assumed to be functional in human speech processing.
Keywords
Language acquisition word representation learning