Cortical Representation of Speech Sounds: Insights from Intracranial Electrophysiology

Oganian, Yulia; Fox, Neal P.; Chang, Edward F.

doi:10.1007/978-3-030-81542-4_3

Part of the book series: Springer Handbook of Auditory Research (SHAR, volume 74)

Abstract

The superior temporal gyrus (STG) has long been recognized as crucial to the human ability to perceive and comprehend spoken language. However, the nature of the neuronal computations and cortical representations responsible for this sensory and cognitive feat remains a mystery. The recent advance of methodologies for intracranial electrophysiology (iEEG) recordings, together with the emergence of novel computational approaches, has heralded progress toward understanding how neural processing in auditory cortex gives rise to the perceptual experience of speech. This chapter describes a collection of intracranial neurophysiology studies that illustrate two fundamental properties of STG encoding of speech sounds. First, this neural representation of speech is firmly rooted in the analysis of high-order acoustic features in the sensory stimulus. Second, the neural representation also differs dramatically from a linear representation of sound acoustics. The STG encodes an imperfect spectrotemporal representation of speech, sacrificing faithfulness to the sensory signal where it enhances the robust encoding of linguistically and behaviorally relevant information. Besides being insensitive to behaviorally irrelevant information carried by the speech signal, STG is also sensitive to behaviorally relevant information not contained within the speech signal (i.e., top-down cues). Overall, mounting evidence suggests that STG is a sensory-perceptual hub for the human speech perception system, functionally characterized by the behaviorally relevant cortical representation of speech that emerges therein.

Compliance with Ethics Requirements

Neal P. Fox declares that he has no conflicts of interest.
Yulia Oganian declares that she has no conflicts of interest.
Edward F. Chang declares that he has no conflicts of interest.

Author information

Yulia Oganian and Neal P. Fox contributed equally to this work.

Authors and Affiliations

Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, USA
Yulia Oganian, Neal P. Fox & Edward F. Chang

Corresponding author

Correspondence to Edward F. Chang.

Editor information

Editors and Affiliations

Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
Lori L. Holt
Department of Otolaryngology, Washington University in St. Louis, Saint Louis, MO, USA
Jonathan E. Peelle
Integrative Physiology and Neuroscience, Washington State University, Vancouver, WA, USA
Allison B. Coffin
Department of Biology, University of Maryland, Silver Spring, MD, USA
Arthur N. Popper
Department of Psychology, Loyola University Chicago, Chicago, IL, USA
Richard R. Fay

About this chapter

Cite this chapter

Oganian, Y., Fox, N.P., Chang, E.F. (2022). Cortical Representation of Speech Sounds: Insights from Intracranial Electrophysiology. In: Holt, L.L., Peelle, J.E., Coffin, A.B., Popper, A.N., Fay, R.R. (eds) Speech Perception. Springer Handbook of Auditory Research, vol 74. Springer, Cham. https://doi.org/10.1007/978-3-030-81542-4_3

DOI: https://doi.org/10.1007/978-3-030-81542-4_3
Published: 22 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81541-7
Online ISBN: 978-3-030-81542-4
eBook Packages: Biomedical and Life Sciences (R0)
