Biomedical Engineering Letters, Volume 5, Issue 1, pp 10–21

Progress in speech decoding from the electrocorticogram

  • Shreya Chakrabarti
  • Hilary M. Sandberg
  • Jonathan S. Brumberg
  • Dean J. Krusienski
Review Article
Part of the following topical collections:
  1. International Biomedical Engineering Conference (IBEC) 2014

Abstract

Recent advances in neuroimaging methods have improved our ability to explore the neurological processes underlying speech and language. As a result of these investigations, it is now possible to decode aspects of speech directly from neural activity toward the development of neuroprosthetic devices for individuals with severe neuromuscular and communication disorders. Much of what is known about the neural correlates of speech articulation and perception is based on lesion and cortical electrical stimulation studies, as well as modern non-invasive neuroimaging. Though extremely important to the current understanding of brain function, traditional neuroimaging methods are primarily limited by the spatial and temporal resolution of the imaging technique. Electrical activity measured from the cortex, or electrocorticography (ECoG), offers several advantages over other neuroimaging modalities for characterization and real-time decoding of brain activity. Specifically, ECoG is well-suited for the study of speech and language owing to its unique spatial and temporal resolution capabilities that allow it to accurately capture the fast-changing dynamics of the large cortical networks underlying speech processing. This review presents the current progress of ECoG-based speech characterization and decoding studies, including an overview of prior neuroimaging studies, ECoG representations of speech production and perception, and a discussion of future directions.

Keywords

Electrocorticography, ECoG, Speech, Neuroprosthetics

Copyright information

© Korean Society of Medical and Biological Engineering and Springer 2015

Authors and Affiliations

  • Shreya Chakrabarti (1)
  • Hilary M. Sandberg (2)
  • Jonathan S. Brumberg (3)
  • Dean J. Krusienski (1)
  1. Dept of Electrical and Computer Engineering, Old Dominion University, Norfolk, USA
  2. Dept of Communication Sciences and Disorders, Old Dominion University, Norfolk, USA
  3. Dept of Speech-Language-Hearing: Sciences & Disorders, University of Kansas, Lawrence, USA
