Velar Movement Assessment for Speech Interfaces: An Exploratory Study Using Surface Electromyography

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 511)


Several silent speech interfaces based on surface electromyography (EMG) have been reported in the literature. However, it remains unclear whether muscle activity related to the opening and closing of the nasal port can be sensed with this technique. Detecting the nasality phenomenon would improve performance for languages with strong nasal characteristics, such as European Portuguese. In this paper we use surface EMG electrodes, a non-invasive method, positioned in the face and neck regions, to investigate whether they capture useful information about velum movement. For an accurate interpretation and validation of the proposed method, we use velum movement information extracted from Real-Time Magnetic Resonance Imaging (RT-MRI) data. Overall, the results of this study show that, for nasal vowels, differences can be found in the EMG signals recorded by sensors positioned below the ear, between the mastoid process and the mandible, in the upper neck region.


Keywords: Velum movement detection · Surface electromyography · Silent speech interfaces



This work was partially funded by Marie Curie IAPP Golem (ref. 251415, FP7-PEOPLE-2009-IAPP), Marie Curie IAPP IRIS (ref. 610986, FP7-PEOPLE-2013-IAPP), by FEDER through the Operational Program Competitiveness Factors (COMPETE) under the scope of QREN 5329 FalaGlobal, by National Funds through FCT (Foundation for Science and Technology) in the context of the project HERON II (PTDC/EEA-PLP/098298/2008), and by the project Cloud Thinking (funded by the QREN Mais Centro program: CENTRO-07-ST24-FEDER-002031).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Microsoft Language Development Center, Lisbon, Portugal
  2. Department of Electronics, Telecommunications and Informatics/IEETA, University of Aveiro, Aveiro, Portugal
  3. Health School/IEETA, University of Aveiro, Aveiro, Portugal
  4. ISCTE-University Institute of Lisbon (ISCTE-IUL), Lisbon, Portugal
