Interactions between speech perception and production during learning of novel phonemic categories

  • Melissa Michaud Baese-Berk
Perceptual/Cognitive Constraints on the Structure of Speech Communication: In Honor of Randy Diehl


A successful language learner must be able to perceive and produce novel sounds in their second language. However, the relationship between learning in perception and production is unclear. Some studies show correlations between the two modalities; others do not. In the present study, I examine learning in perception and production after training in a distributional learning paradigm. Training modality was manipulated, while testing modality remained constant. Overall, participants showed substantial learning in the modality in which they were trained; however, learning across modalities showed a more complex pattern. Although individuals trained in perception improved in production, individuals trained in production did not show substantial learning in perception. That is, production during training disrupted perceptual learning. Further, correlations between learning in the two modalities were not strong. Several possible explanations for the pattern of results are explored, including a close examination of the role of production variability, and the results are explained using a paradigm appealing to shared cognitive resources. The article concludes with a discussion of the implications of these results for theories of second-language learning, speech perception, and production.


Speech perception, Speech production, Perceptual learning



This work was supported by National Science Foundation Grants BCS-0951943 and BCS-1734166. I would like to thank Ann Bradlow, Matthew Goldrick, and Arthur Samuel for their comments on previous versions of this work.

Open practices statement

The data reported here are available, but none of the experiments were preregistered.



Copyright information

© The Psychonomic Society, Inc. 2019

Authors and Affiliations

  1. Department of Linguistics, 1290 University of Oregon, Eugene, USA
