Development of a Silent Speech Interface for Augmented Reality Applications

  • Conference paper
Computer Methods, Imaging and Visualization in Biomechanics and Biomedical Engineering II (CMBBE 2021)

Abstract

Silent speech interfaces using non-invasive electromyography (EMG) sensors have been used to control internet-of-things devices [1] and to provide communication in acoustically challenging environments [2]. However, they have yet to be implemented in augmented reality displays, an area they could revolutionize as a human-machine interface by offering low-profile, fluid input. This study describes the development of a silent speech interface that receives and decodes subvocalizations recorded by skin-surface EMG sensors in order to control a heads-up display built on a Microsoft HoloLens. Muscle activation of the anterior cervical region was measured while a subject subvocalized words from a predetermined library, and the recorded trials were parsed into individual subvocalizations to build a training dataset for a speech recognition model. The model, a one-dimensional convolutional neural network that classifies subvocalized words, was built in Python with the Keras application programming interface and the TensorFlow library. Preliminary results demonstrate effectiveness in classifying commands: ten trained models achieved word classification accuracies ranging from 66.6% to 100%, with an average of 82.5% across all models. Although all models were trained and tested on the same datasets, the stochastic nature of training significantly affects the outcome: the dropout layer adds artificial noise during training, and the gradient-descent-based optimization algorithm introduces random variance into the effectiveness of the completed model.
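
The abstract names the stack (a one-dimensional convolutional neural network built in Python with the Keras API on TensorFlow) but not the architecture itself. The sketch below is a minimal, hypothetical illustration of such a classifier; the channel count, window length, vocabulary size, layer widths, and optimizer choice are assumptions for demonstration, not the authors' configuration.

```python
# Minimal sketch of a 1D-CNN subvocalization classifier in Keras/TensorFlow.
# All sizes below (channels, window length, vocabulary, layer widths) are
# illustrative assumptions, not values reported in the paper.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_CHANNELS = 4       # assumed number of surface EMG electrodes
WINDOW_SAMPLES = 1000  # assumed samples per parsed subvocalization
NUM_WORDS = 10         # assumed size of the predetermined word library

model = keras.Sequential([
    keras.Input(shape=(WINDOW_SAMPLES, NUM_CHANNELS)),
    # Temporal convolutions extract muscle-activation features from raw EMG.
    layers.Conv1D(32, kernel_size=7, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=4),
    layers.GlobalAveragePooling1D(),
    # Dropout is the source of the "artificial noise" during training
    # mentioned in the abstract.
    layers.Dropout(0.5),
    layers.Dense(NUM_WORDS, activation="softmax"),
])

# Adam is a gradient-descent-based optimizer; its stochastic updates (together
# with dropout) make repeated training runs land at different accuracies.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Placeholder arrays showing the expected input shapes; in the study these
# would be parsed subvocalization windows and their word labels.
x_train = np.random.randn(64, WINDOW_SAMPLES, NUM_CHANNELS).astype("float32")
y_train = np.random.randint(0, NUM_WORDS, size=64)
model.fit(x_train, y_train, epochs=5, batch_size=16)
```

Because both dropout and the optimizer are stochastic, training this same sketch ten times on identical data would, like the ten models reported above, produce a spread of final accuracies.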

References

  1. Wadkins, E.: A continuous silent speech recognition system for AlterEgo, a silent speech interface, 24 May 2019. https://dspace.mit.edu/bitstream/handle/1721.1/123121/1128187233-MIT.pdf?sequence=1&isAllowed=y

  2. Meltzner, G.S., Heaton, J.T., Deng, Y.: The MUTE silent speech recognition system. In: INTERSPEECH, Burlington (2013)

  3. American Speech-Language-Hearing Association (ASHA): Quick Facts. Accessed 12 Sept 2021

  4. Gonzalez-Lopez, J.A., Gomez-Alanis, A., Martin-Donas, J.M., Perez-Cordoba, J.L., Gomez, A.M.: Silent speech interfaces for speech restoration: a review, Granada (2020)

  5. Hummel, J., et al.: Evaluation of a new electromagnetic tracking system using a standardized assessment protocol. Phys. Med. Biol. 51(10), 27 (2006)

  6. Fagan, M.J., Ell, S.R., Gilbert, J.M., Sarrazin, E., Chapman, P.M.: Development of a (silent) speech recognition system for patients following laryngectomy. Med. Eng. Phys. 30, 419–425 (2008)

  7. Hueber, T., Chollet, G., Denby, B., Stone, M., Zouari, L.: Ouisper: corpus based synthesis driven by articulatory data. In: 16th International Congress of Phonetic Sciences (2007)

  8. Nakajima, Y.: Development and evaluation of soft silicone NAM. In: IEICE, pp. 7–12 (2005)

  9. Bos, J., Tack, D.: Speech input hardware investigation for future dismounted soldier computer systems (2005)

  10. Hansen, J.H., Patil, S.A.: The physiological microphone (PMIC): a competitive alternative for speaker assessment in stress detection and speaker verification. Speech Commun. 52(4), 327–340 (2010)

  11. Titze, I.R., Story, B.H., Burnett, G.C., Holzrichter, J.F., Ng, L.C., Lea, W.A.: Comparison between electroglottography and electromagnetic glottography. J. Acoust. Soc. Am. 107(1), 581–588 (2000)

  12. Spinlab: Tuned Electromagnetic Resonator Collar Sensor (2004). Accessed 2021

  13. Tamm, M.-O., Muhammad, Y., Muhammad, N.: Classification of vowels from imagined speech with convolutional neural networks, University of Tartu: Institute of Computer Science (2020)

  14. Neuper, C., Müller, G.R., Kübler, A., Birbaumer, N., Pfurtscheller, G.: Clinical application of an EEG-based brain-computer interface: a case study in a patient. Clin. Neurophysiol. 114, 399–409 (2003)

  15. Bartels, J., et al.: Neurotrophic electrode: method of assembly and implantation. J. Neurosci. Methods 174(2), 168–176 (2008)

  16. Hochberg, L.R., et al.: Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature 442(7099), 164–171 (2006)

  17. Huang, Y., Low, K., Lim, H.: Initial analysis of EMG signals of hand functions associated to rehabilitation tasks. In: International Conference on Robotics and Biomimetics, Singapore (2009)

  18. Regents of the University of Michigan: Neurosciences: Movement Disorders (2021). https://www.uofmhealth.org/conditions-treatments/brain-neurological-conditions/movement-disorders

Acknowledgements

Special thanks to Embry-Riddle Aeronautical University’s biomedical laboratory facilities and management, which are supported by the Mechanical Engineering Department.

Author information

Corresponding author

Correspondence to Christine Walck.

Ethics declarations

Declaration Statements

N/A

Funding: Ignite

Ethical Approval: IRB 21-117

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Walck, C., Rivas, T., Flanagan, R., Fornito, M. (2023). Development of a Silent Speech Interface for Augmented Reality Applications. In: Tavares, J.M.R.S., Bourauel, C., Geris, L., Vander Sloten, J. (eds) Computer Methods, Imaging and Visualization in Biomechanics and Biomedical Engineering II. CMBBE 2021. Lecture Notes in Computational Vision and Biomechanics, vol 38. Springer, Cham. https://doi.org/10.1007/978-3-031-10015-4_18

  • DOI: https://doi.org/10.1007/978-3-031-10015-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10014-7

  • Online ISBN: 978-3-031-10015-4

  • eBook Packages: Engineering (R0)
