Abstract
Silent speech interfaces using non-invasive electromyography (EMG) sensors have been used to control internet-of-things devices [1] and to provide communication in acoustically challenging environments [2]. However, they have yet to be integrated into augmented reality displays, an area they could transform as a human-machine interface by offering low-profile, fluid input. This study describes the development of a silent speech interface that receives and decodes subvocalizations recorded by skin-surface EMG sensors in order to control a heads-up display built on a Microsoft HoloLens. Muscle activation of the anterior cervical region was measured while a subject subvocalized words from a predetermined library. Recorded trials were parsed into individual subvocalizations to build a dataset for training a speech recognition model. The model, a one-dimensional convolutional neural network that classifies subvocalized words, was built in Python with the Keras application programming interface on the TensorFlow library. Preliminary results demonstrate effectiveness in classifying commands: ten trained models achieved word classification accuracies ranging from 66.6% to 100%, with an average of 82.5% across all models. Although all models were trained and tested on the same datasets, stochastic elements of training significantly affect the output: the dropout layer adds artificial noise during training, and the gradient-descent-based optimization algorithm introduces random variance into the effectiveness of the completed model.
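The abstract describes a one-dimensional convolutional neural network built with Keras on TensorFlow, with a dropout layer and a gradient-descent-based optimizer. A minimal sketch of such a classifier is shown below; the window length, electrode count, filter sizes, and ten-word vocabulary are illustrative assumptions, not the authors' actual hyperparameters.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_SAMPLES = 512   # assumed EMG samples per subvocalized-word window
N_CHANNELS = 4    # assumed number of surface EMG electrodes
N_CLASSES = 10    # assumed size of the trained command vocabulary

def build_model():
    """A small 1D-CNN over raw EMG windows, in the spirit of the paper."""
    model = keras.Sequential([
        layers.Input(shape=(N_SAMPLES, N_CHANNELS)),
        layers.Conv1D(32, kernel_size=7, activation="relu"),
        layers.MaxPooling1D(pool_size=4),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dropout(0.5),  # the stochastic regularizer noted in the abstract
        layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",  # a gradient-descent-based optimizer
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_model()
# One dummy EMG window in, one probability distribution over the vocabulary out.
probs = model.predict(np.zeros((1, N_SAMPLES, N_CHANNELS)), verbose=0)
print(probs.shape)  # (1, 10)
```

In practice the parsed per-word EMG segments would be stacked into an array of shape `(n_trials, N_SAMPLES, N_CHANNELS)` and passed to `model.fit`; dropout and the optimizer's random initialization explain the run-to-run accuracy spread the abstract reports.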
References
Wadkins, E.: A continuous silent speech recognition system for AlterEgo, a silent speech interface, 24 May 2019. https://dspace.mit.edu/bitstream/handle/1721.1/123121/1128187233-MIT.pdf?sequence=1&isAllowed=y
Meltzner, G.S., Heaton, J.T., Deng, Y.: The MUTE silent speech recognition system. In: INTERSPEECH, Burlington (2013)
American Speech-Language-Hearing Association (ASHA): Quick Facts. Accessed 12 Sept 2021
Gonzalez-Lopez, J.A., Gomez-Alanis, A., Martin-Donas, J.M., Perez-Cordoba, J.L., Gomez, A.M.: Silent speech interfaces for speech restoration: a review, Granada (2020)
Hummel, J., et al.: Evaluation of a new electromagnetic tracking system using a standardized assessment protocol. Phys. Med. Biol. 51(10), 27 (2006)
Fagan, M.J., Ell, S.R., Gilbert, J.M., Sarrazin, E., Chapman, P.M.: Development of a (silent) speech recognition system for patients following laryngectomy. Med. Eng. Phys. 30, 419–425 (2008)
Hueber, T., Chollet, G., Denby, B., Stone, M., Zouari, L.: Ouisper: corpus based synthesis driven by articulatory data. In: 16th International Congress of Phonetic Sciences (2007)
Nakajima, Y.: Development and evaluation of soft silicone NAM. In: IEICE, pp. 7–12 (2005)
Bos, J., Tack, D.: Speech input hardware investigation for future dismounted soldier computer systems (2005)
Hansen, J.H., Patil, S.A.: The physiological microphone (PMIC): a competitive alternative for speaker assessment in stress detection and speaker verification. Speech Commun. 52(4), 327–340 (2010)
Titze, I.R., Story, B.H., Burnett, G.C., Holzrichter, J.F., Ng, L.C., Lea, W.A.: Comparison between electroglottography and electromagnetic glottography. J. Acoust. Soc. Am. 107(1), 581–588 (2000)
Spinlab: Tuned Electromagnetic Resonator Collar Sensor (2004). Accessed 2021
Tamm, M.-O., Muhammad, Y., Muhammad, N.: Classification of vowels from imagined speech with convolutional neural networks, University of Tartu: Institute of Computer Science (2020)
Neuper, C., Müller, G.R., Kübler, A., Birbaumer, N., Pfurtscheller, G.: Clinical application of an EEG-based brain-computer interface: a case study in a patient. Clin. Neurophysiol. 114, 399–409 (2003)
Bartels, J., et al.: Neurotrophic electrode: method of assembly and implantation. J. Neurosci. Methods 174(2), 168–176 (2008)
Hochberg, L.R., et al.: Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature 442(7099), 164–171 (2006)
Huang, Y., Low, K., Lim, H.: Initial analysis of EMG signals of hand functions associated to rehabilitation tasks. In: International Conference on Robotics and Biomimetics, Singapore (2009)
Regents of the University of Michigan: Neurosciences: Movement Disorders (2021). https://www.uofmhealth.org/conditions-treatments/brain-neurological-conditions/movement-disorders
Acknowledgements
Special thanks to Embry-Riddle Aeronautical University’s biomedical laboratory facilities and management, which are supported by the Mechanical Engineering Department.
Ethics declarations
Declaration Statements
N/A
Funding: Ignite
Ethical Approval: IRB 21-117
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Walck, C., Rivas, T., Flanagan, R., Fornito, M. (2023). Development of a Silent Speech Interface for Augmented Reality Applications. In: Tavares, J.M.R.S., Bourauel, C., Geris, L., Vander Slote, J. (eds) Computer Methods, Imaging and Visualization in Biomechanics and Biomedical Engineering II. CMBBE 2021. Lecture Notes in Computational Vision and Biomechanics, vol 38. Springer, Cham. https://doi.org/10.1007/978-3-031-10015-4_18
DOI: https://doi.org/10.1007/978-3-031-10015-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10014-7
Online ISBN: 978-3-031-10015-4
eBook Packages: Engineering (R0)