Abstract
The superior temporal gyrus (STG) has long been recognized as crucial to the human ability to perceive and comprehend spoken language. However, the nature of the neuronal computations and cortical representations responsible for this sensory and cognitive feat remains a mystery. The recent advance of methodologies for intracranial electrophysiology (iEEG) recordings, together with the emergence of novel computational approaches, has heralded progress toward understanding how neural processing in auditory cortex gives rise to the perceptual experience of speech. This chapter describes a collection of intracranial neurophysiology studies that illustrate two fundamental properties of STG encoding of speech sounds. First, this neural representation of speech is firmly rooted in the analysis of high-order acoustic features in the sensory stimulus. Second, the neural representation also differs dramatically from a linear representation of sound acoustics. The STG encodes an imperfect spectrotemporal representation of speech, sacrificing faithfulness to the sensory signal where doing so enhances the robust encoding of linguistically and behaviorally relevant information. Besides being insensitive to behaviorally irrelevant information carried by the speech signal, the STG is also sensitive to behaviorally relevant information not contained within the speech signal (i.e., top-down cues). Overall, mounting evidence suggests that the STG is a sensory-perceptual hub for the human speech perception system, functionally characterized by the behaviorally relevant cortical representation of speech that emerges therein.
Compliance with Ethics Requirements
Neal P. Fox declares that he has no conflicts of interest.
Yulia Oganian declares that she has no conflicts of interest.
Edward F. Chang declares that he has no conflicts of interest.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Oganian, Y., Fox, N.P., Chang, E.F. (2022). Cortical Representation of Speech Sounds: Insights from Intracranial Electrophysiology. In: Holt, L.L., Peelle, J.E., Coffin, A.B., Popper, A.N., Fay, R.R. (eds) Speech Perception. Springer Handbook of Auditory Research, vol 74. Springer, Cham. https://doi.org/10.1007/978-3-030-81542-4_3
DOI: https://doi.org/10.1007/978-3-030-81542-4_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81541-7
Online ISBN: 978-3-030-81542-4
eBook Packages: Biomedical and Life Sciences (R0)