Abstract
The motor theory of speech perception assumes that activation of the motor system is essential in the perception of speech. However, deficits in speech perception and comprehension do not arise from damage that is restricted to the motor cortex, few functional imaging studies reveal activity in the motor cortex during speech perception, and the motor cortex is strongly activated by many different sound categories. Here, we evaluate alternative roles for the motor cortex in spoken communication and suggest a specific role in sensorimotor processing in conversation. We argue that motor cortex activation is essential in joint speech, particularly for the timing of turn taking.
Acknowledgements
S.K.S., C.M. and F.E. are funded by Wellcome Trust Grant WT074414MA. We would like to thank K. Kluender, H. Mitterer, L. Bernstein, T. Manly and M. Davis for very helpful discussions on many of these issues.
Glossary
- Convergence
  In this context, the way that different aspects of joint speech (both motoric and linguistic) become united, or coordinated, between speakers.
- Diphone
  A cluster of two phones that can be legally combined in a language (for example, /sk/ is legal at the start of a syllable in English, but /ks/ is not); diphones thus contain transitional information between the two phones, and are more information-rich than single phones.
- Embodied semantic representations
  In this context, theories of semantic representations that link the more abstract elements of the representations to more concrete elements of their material properties; for example, part of the meaning of 'a football' is represented by how one might kick it.
- Expressive aphasia
  A speech-production deficit in which people have reduced fluency, grammatical errors and problems in articulating accurately.
- Linguistics
  In this context, the phonemic, semantic or syntactic processing of heard speech, which is distinct from the processing of the basic acoustic properties of speech (for example, loudness).
- Local structure computation
  The sequential analysis of heard speech (for example, 'the sandwich was eaten'), as opposed to higher-order, hierarchical computations across longer timescales (for example, 'the sandwiches were eaten by the children at the party').
- Phone
  A single speech sound (which is always a variant of a phoneme); for example, the aspirated /p/ at the start of 'port' is a different phone from the /p/ of 'sport', but these are both examples (allophones) of the phoneme /p/.
- Phoneme
  An elemental sound of speech (such as /p/ or /t/) that can be used in the explicit transcription and classification of the sounds of a language.
- Phonemic
  Pertaining to the representation and processing of phonemes.
- Phonetic
  Pertaining to speech sounds (phones).
- Pre-lexical processing
  In this context, the neural processing of speech sounds before the representation of word identity and meaning.
- Receptive aphasia
  A speech-perception and -comprehension deficit in which the patient has great difficulty in following what is being said to them. Speech production is unimpaired in terms of fluency but speech content can be meaningless, and many patients are unaware that they have a problem.
- Semantic
  Relating to the meaning of things, in this case words and language.
- Spectral centre of gravity
  The average value of the spectral components of a sound, which captures how the sound is weighted across low to high frequencies; for example, 's' has a higher spectral centre of gravity than 'sh'.
- Speech comprehension
  In this context, post-perceptual, lexical, semantic and linguistic processing of speech. Although speech comprehension does require good speech perception, comprehension can also be enhanced by higher-order syntactic and semantic features (for example, sentence predictability).
- Speech perception
  In this context, the pre-lexical perceptual processing of the speech signal.
- Syllable
  Like a diphone, a syllable typically contains information about the organization of speech at a level higher than the phoneme. A single-syllable word, like 'start', can be broken down into an onset and a rhyme (for example, st-art), and may consist of only the rhyme (for example, 'art'): the rhyme may be further broken down into a nucleus and coda (for example, ar-t).
- Syntax
  The rules that determine the correct arrangement and inflection of words in spoken or written language.
- Voicing
  The sound made by vibrations of the vocal folds; for example, the sound at the start of 'zoo' is voiced, whereas that at the start of 'sue' is unvoiced.
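
As a rough illustration of the 'spectral centre of gravity' entry in the glossary, the sketch below computes an amplitude-weighted mean frequency for a short signal. It is not taken from the article: the function names `spectral_centroid` and `tone`, the 8,000 Hz sample rate, the amplitude (rather than power) weighting, and the use of pure tones as stand-ins for real 's' and 'sh' recordings are all assumptions made for illustration.

```python
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Return the amplitude-weighted mean frequency (Hz) of a real signal.

    Assumption: each frequency bin is weighted by its magnitude; other
    conventions weight by power instead.
    """
    n = len(samples)
    weighted_sum = 0.0
    total = 0.0
    # Naive discrete Fourier transform over the non-negative frequency bins.
    # This is slow but dependency-free, and adequate for short signals.
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted_sum / total if total else 0.0


def tone(freq, sample_rate=8000, n=256):
    """Hypothetical helper: a pure sine tone used as a toy test signal."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


# A higher-pitched toy signal has a higher spectral centre of gravity.
low = spectral_centroid(tone(500), 8000)
high = spectral_centroid(tone(2000), 8000)
print(low < high)
```

With real speech, the same calculation would be applied to a windowed segment of the fricative, where the energy of 's' is expected to sit at higher frequencies than that of 'sh'.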
Cite this article
Scott, S., McGettigan, C. & Eisner, F. A little more conversation, a little less action — candidate roles for the motor cortex in speech perception. Nat Rev Neurosci 10, 295–302 (2009). https://doi.org/10.1038/nrn2603
DOI: https://doi.org/10.1038/nrn2603
- Springer Nature Limited