New wireless connection between user and VE using speech processing

Mirzaei, M. Ali; Merienne, Frederic; Oliver, James H.

doi:10.1007/s10055-014-0248-y

Original Article
Published: 20 July 2014

Volume 18, pages 235–243, (2014)
Virtual Reality

M. Ali Mirzaei¹,
Frederic Merienne¹ &
James H. Oliver²

An Erratum to this article was published on 12 March 2015

Abstract

This paper presents a novel speak-to-VR virtual-reality peripheral network (VRPN) server based on speech processing. The server uses a microphone array as its speech source and streams the processing results over a Wi-Fi network. The proposed VRPN server provides a handy, portable and wireless human-machine interface that can facilitate interaction in a variety of interfaces and application domains, including HMD- and CAVE-based virtual reality systems, flight and driving simulators, and many others. The server is built in C++ on a speech processing software development kit (SDK) and the VRPN library. Speak-to-VR VRPN works well even in the presence of background noise or the voices of other users in the vicinity. The speech processing algorithm is not sensitive to the user’s accent because it is trained while it is operating: speech recognition parameters are trained by a hidden Markov model in real time. The advantages and disadvantages of the speak-to-VR server are studied under different configurations. The efficiency and precision of the server in a real application are then validated in a formal user study with ten participants. Two experimental test setups are implemented on a CAVE system, using either a Kinect Xbox sensor or a microphone array as the input device. Each participant is asked to navigate in a virtual environment and manipulate an object. The experimental data analysis shows promising results and motivates additional research opportunities.
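The abstract states that recognition relies on a hidden Markov model but gives no implementation detail. As a rough illustration only, the sketch below shows the standard forward algorithm for a discrete HMM and how a spoken command could be chosen by comparing likelihoods across one model per command. Everything here is an assumption made for illustration: the names `Hmm`, `forward_likelihood` and `most_likely_command` are hypothetical, the observation symbols stand in for quantized acoustic features, and none of it is taken from the authors' server or from the speech SDK they used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Discrete HMM with N hidden states and M observation symbols.
// Hypothetical toy structure, not the model used in the paper.
struct Hmm {
    std::vector<double> initial;                  // initial[i] = P(first state is i)
    std::vector<std::vector<double>> transition;  // transition[i][j] = P(next is j | current is i)
    std::vector<std::vector<double>> emission;    // emission[i][k] = P(symbol k | state i)
};

// Forward algorithm: probability of the observation sequence under the
// model, summed over all hidden state paths.
double forward_likelihood(const Hmm& hmm, const std::vector<int>& observations) {
    const std::size_t n = hmm.initial.size();
    assert(n > 0 && hmm.transition.size() == n && hmm.emission.size() == n);
    if (observations.empty()) return 1.0;

    // Initialisation: alpha[i] = P(first symbol, first state = i).
    std::vector<double> alpha(n);
    for (std::size_t i = 0; i < n; ++i) {
        alpha[i] = hmm.initial[i] * hmm.emission[i][observations[0]];
    }

    // Induction: fold in one observation per step.
    for (std::size_t t = 1; t < observations.size(); ++t) {
        std::vector<double> next(n, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            double reach = 0.0;  // probability mass arriving in state j
            for (std::size_t i = 0; i < n; ++i) {
                reach += alpha[i] * hmm.transition[i][j];
            }
            next[j] = reach * hmm.emission[j][observations[t]];
        }
        alpha = next;
    }

    // Termination: sum over the final states.
    double total = 0.0;
    for (double a : alpha) total += a;
    return total;
}

// Returns the index of the command model that best explains the observations.
std::size_t most_likely_command(const std::vector<Hmm>& models,
                                const std::vector<int>& observations) {
    assert(!models.empty());
    std::size_t best = 0;
    double best_score = forward_likelihood(models[0], observations);
    for (std::size_t m = 1; m < models.size(); ++m) {
        const double score = forward_likelihood(models[m], observations);
        if (score > best_score) {
            best_score = score;
            best = m;
        }
    }
    return best;
}
```

In a real recognizer the probabilities would be kept in log space or rescaled at each step to avoid underflow on long utterances. The index returned by `most_likely_command` is the kind of discrete result a VRPN server could forward to the virtual environment, for example as a button event.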

Author information

Authors and Affiliations

Lab. Le2i, Institute Image, Paris-Tech, Paris, France
M. Ali Mirzaei & Frederic Merienne
Virtual Reality Application Center, Iowa State University, Ames, IA, USA
James H. Oliver

Corresponding author

Correspondence to M. Ali Mirzaei.

About this article

Cite this article

Mirzaei, M.A., Merienne, F. & Oliver, J.H. New wireless connection between user and VE using speech processing. Virtual Reality 18, 235–243 (2014). https://doi.org/10.1007/s10055-014-0248-y

Received: 21 November 2013
Accepted: 23 June 2014
Published: 20 July 2014
Issue Date: November 2014
DOI: https://doi.org/10.1007/s10055-014-0248-y
