Abstract
This paper proposes a full body gesture recognition system for use in an intelligent space, allowing users to control their environment. We describe a successful adaptation of the traditional strategy used in the design of spoken language recognition systems to the new domain of full body gesture recognition. The experimental evaluation was carried out on a realistic task in which users control different elements of the environment through gesture sequences. The results were obtained by applying a rigorous experimental procedure that evaluates several feature extraction strategies. The average recognition rates are around 97% at the gestural sentence level and over 98% at the gesture level, experimentally validating the proposal.
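The adaptation from speech to gesture recognition described above typically means decoding a sequence of observed pose features into the most likely sequence of hidden gesture states, as in HMM-based speech recognisers. The following is a minimal illustrative sketch of such Viterbi decoding; the gesture states, pose symbols, and probabilities are invented for the example and are not taken from the paper.

```python
# Minimal Viterbi decoder for a discrete-observation HMM, sketching how a
# speech-recognition-style pipeline can label a sequence of quantised pose
# features with gestures. All states, symbols, and probabilities below are
# illustrative assumptions, not values from the paper.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for the observation list."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Trace back from the best-scoring final state
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy model: two gesture states emitting quantised arm-pose symbols
states = ("raise_arm", "wave")
start_p = {"raise_arm": 0.6, "wave": 0.4}
trans_p = {"raise_arm": {"raise_arm": 0.7, "wave": 0.3},
           "wave": {"raise_arm": 0.4, "wave": 0.6}}
emit_p = {"raise_arm": {"up": 0.8, "side": 0.2},
          "wave": {"up": 0.3, "side": 0.7}}

print(viterbi(["up", "up", "side"], states, start_p, trans_p, emit_p))
# → ['raise_arm', 'raise_arm', 'wave']
```

In a full system of the kind evaluated here, the per-frame symbols would come from a feature extraction stage over skeleton data, and a grammar over gestures would constrain the allowed state sequences, mirroring the language model of a speech recogniser.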
Notes
1. Gestures and actions in the same table row are not necessarily related; the columns merely list the different elements considered.
Acknowledgements
This work has been supported by the Spanish Ministry of Economy and Competitiveness under project SPACES-UAH (TIN2013-47630-C2-1-R), and by the University of Alcalá under projects DETECTOR and ARMIS.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Casillas-Perez, D., Macias-Guarasa, J., Marron-Romera, M., Fuentes-Jimenez, D., Fernandez-Rincon, A. (2016). Full Body Gesture Recognition for Human-Machine Interaction in Intelligent Spaces. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2016. Lecture Notes in Computer Science(), vol 9656. Springer, Cham. https://doi.org/10.1007/978-3-319-31744-1_58
Print ISBN: 978-3-319-31743-4
Online ISBN: 978-3-319-31744-1