
Deep-learning-based real-time silent speech recognition using facial electromyogram recorded around eyes for hands-free interfacing in a virtual reality environment

  • Original Article
  • Published in: Virtual Reality (2022)

Abstract

Speech recognition technology is a promising hands-free interfacing modality for virtual reality (VR) applications. However, it has several drawbacks, such as limited usability in noisy environments or public places and limited accessibility for those who cannot produce loud and clear voices. These limitations may be overcome by employing silent speech recognition (SSR) technology utilizing facial electromyograms (fEMGs) in a VR environment. In conventional SSR systems, however, fEMG electrodes were attached around the user’s lips and neck, creating new practical issues: an additional wearable system is required besides the VR headset, attaching the fEMG electrodes is complex and time-consuming, and the electrodes cause discomfort and restrict the user’s facial muscle movements. To solve these problems, we propose an SSR system using fEMGs measured by a few electrodes attached around the eyes of a user, which can be easily incorporated into existing VR headsets. To enhance the accuracy of classifying fEMG signals recorded from a limited number of locations relatively far from the phonatory organs, a deep neural network-based classification method was developed using similar fEMG data previously collected from other individuals and then transformed by dynamic positional warping. In the experiments, the proposed SSR system classified six different fEMG patterns generated by six silently spoken words with an accuracy of 92.53%. To further demonstrate that our SSR system can be used as a hands-free control interface in practical VR applications, an online SSR system was also implemented.
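As a rough illustration of the idea, the sketch below aligns fEMG feature epochs collected from other subjects to a new user's per-word templates before training a classifier. This is a minimal sketch under stated assumptions, not the authors' implementation: standard dynamic time warping (DTW) stands in for their dynamic positional warping, a small scikit-learn MLP stands in for their deep neural network, and the channel count, frame count, and synthetic data are all hypothetical.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def dtw_align(seq, template):
        """Warp a (T, C) feature sequence onto a (U, C) template via standard DTW.

        Stand-in for dynamic positional warping; returns a (U, C) sequence.
        """
        T, U = len(seq), len(template)
        cost = np.full((T + 1, U + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, T + 1):
            for j in range(1, U + 1):
                d = np.linalg.norm(seq[i - 1] - template[j - 1])
                cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
        # Backtrack from (T, U) to recover the optimal warping path.
        path, i, j = [], T, U
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
            if step == 0:
                i, j = i - 1, j - 1
            elif step == 1:
                i -= 1
            else:
                j -= 1
        # Average the input frames matched to each template frame.
        warped = np.zeros_like(template, dtype=float)
        counts = np.zeros(U)
        for si, ti in path:
            warped[ti] += seq[si]
            counts[ti] += 1
        return warped / np.maximum(counts, 1)[:, None]

    # Hypothetical dimensions: six words (from the paper); frames/channels assumed.
    rng = np.random.default_rng(0)
    WORDS, FRAMES, CHANNELS = 6, 50, 8

    # Synthetic stand-in data; a real system would extract features from fEMG here.
    target_templates = rng.normal(size=(WORDS, FRAMES, CHANNELS))  # new user's templates
    source_epochs = target_templates[:, None] + 0.5 * rng.normal(
        size=(WORDS, 20, FRAMES, CHANNELS))                        # other subjects' epochs

    X, y = [], []
    for w in range(WORDS):
        for epoch in source_epochs[w]:
            # Warp another subject's epoch toward the new user's template for word w,
            # then use it as extra training data (labels are known at training time).
            X.append(dtw_align(epoch, target_templates[w]).ravel())
            y.append(w)

    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0).fit(X, y)

In the paper itself, warping other users' data toward the new user's recordings is what allows a deep network to be trained despite the small amount of per-user data; the alignment step above is only a schematic stand-in for that transformation.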


Availability of data and material

Not applicable.

Code availability

Not applicable.

Notes

  1. https://www.oculus.com.

  2. https://store.steampowered.com/app/598400/Starship_Commander_Arcade.

  3. https://store.steampowered.com/app/414120/Modbox/.


Acknowledgements

This work was supported in part by the Samsung Science and Technology Foundation [SRFC-TB1703-05, facial electromyogram-based facial expression recognition for interactive VR applications], in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A2C2086593), and in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01373, Artificial Intelligence Graduate School Program (Hanyang University)).


Author information


Contributions

HC conducted the overall data analyses and wrote the major part of the paper. WC provided the dynamic positional warping algorithm and insight into the data analysis. CI provided important insight into the design of the paper and revised the manuscript. All listed authors made considerable contributions to this paper and approved the submitted version.

Corresponding author

Correspondence to Chang-Hwan Im.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cha, HS., Chang, WD. & Im, CH. Deep-learning-based real-time silent speech recognition using facial electromyogram recorded around eyes for hands-free interfacing in a virtual reality environment. Virtual Reality 26, 1047–1057 (2022). https://doi.org/10.1007/s10055-021-00616-0

