Abstract
Perceiving and understanding spoken language is something that most listeners take for granted, at least in favorable listening conditions. Yet decades of research have demonstrated that speech is variable and ambiguous, meaning that listeners must constantly engage in active hypothesis testing about what was said. Within this framework, even relatively minor challenges imposed on speech recognition must be understood as requiring the interaction of perceptual, cognitive, and linguistic factors. This chapter provides a systematic review of the various ways in which listening environments may be considered adverse, with a dual focus on the cognitive and neural systems that are thought to improve speech recognition in these challenging situations. Although no single mechanism or construct can entirely explain how listeners cope with adversity in speech recognition, overcoming listening adversity is an attentionally guided process. Neurally, speech comprehension in many adverse listening conditions appears to depend on higher-order (rather than primary) representations of speech in cortex, suggesting that more abstract linguistic knowledge and context become particularly important for comprehension when acoustic input is compromised. Additionally, the involvement of the cingulo-opercular (CO) network, particularly the anterior insula, in a myriad of adverse listening situations may indicate that activity in this network is a general index of cognitive effort. In discussing the various challenges faced in the perception and understanding of speech, it is critically important to consider the interaction of the listener’s cognitive resources (knowledge and abilities) with the specific challenges imposed by the listening environment.
Compliance with Ethics Requirements
Ingrid Johnsrude declares that she has no conflict of interest.
Stephen Van Hedger declares that he has no conflict of interest.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Van Hedger, S.C., Johnsrude, I.S. (2022). Speech Perception Under Adverse Listening Conditions. In: Holt, L.L., Peelle, J.E., Coffin, A.B., Popper, A.N., Fay, R.R. (eds) Speech Perception. Springer Handbook of Auditory Research, vol 74. Springer, Cham. https://doi.org/10.1007/978-3-030-81542-4_6
DOI: https://doi.org/10.1007/978-3-030-81542-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81541-7
Online ISBN: 978-3-030-81542-4
eBook Packages: Biomedical and Life Sciences (R0)