Journal on Multimodal User Interfaces

Volume 11, Issue 1, pp 1–7

Gesture recognition based on HMM-FNN model using a Kinect

  • Xiao-Li Guo
  • Ting-Ting Yang
Original Paper


To address the problem of recognizing complex dynamic gestures, this paper acquires a depth image of the body with a Kinect motion-sensing device and segments the gesture region from that depth image by threshold segmentation, using the typical distance between the hand and the body as the basis for the threshold. An HMM-FNN model, which combines the hidden Markov model (HMM) with the fuzzy neural network (FNN), is then used for dynamic gesture recognition. The custom gesture interaction set is defined around the operations that trainees commonly perform on equipment in a virtual substation. Based on the characteristics of complex dynamic gestures, each gesture is decomposed into three feature sequences for feature extraction: change in hand shape, change in hand position in the two-dimensional plane, and movement along the Z-axis. A separate HMM is built for each of the three subsequences, and the FNN is connected to their outputs to infer the semantics of the gesture by fuzzy reasoning. Experiments show that the HMM-FNN model recognizes complicated dynamic hand gestures quickly and effectively, is strongly robust, and outperforms a plain HMM model.
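The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 150 mm hand-to-body gap, the use of the median depth as the body distance, the discrete-observation HMMs, and the softmax-plus-minimum fusion rule (a simple stand-in for the trained FNN) are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def segment_hand(depth_mm, hand_body_gap_mm=150):
    """Threshold segmentation: keep pixels well in front of the body.

    Assumptions: depth is in millimetres, 0 marks a missing reading,
    and the median valid depth approximates the body distance.
    """
    valid = depth_mm > 0
    body_depth = np.median(depth_mm[valid])
    return valid & (depth_mm < body_depth - hand_body_gap_mm)

def forward_loglik(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM.

    start: (n_states,), trans: (n_states, n_states),
    emit: (n_states, n_symbols), obs: sequence of symbol indices.
    """
    alpha = start * emit[:, obs[0]]
    loglik = 0.0
    for symbol in obs[1:]:
        scale = alpha.sum()
        loglik += np.log(scale)          # accumulate the scaling factors
        alpha = (alpha / scale) @ trans * emit[:, symbol]
    return loglik + np.log(alpha.sum())

def fuzzy_fuse(stream_logliks):
    """Combine per-stream HMM scores into one gesture decision.

    stream_logliks has shape (n_streams, n_gestures): one row each for
    hand shape, 2D position, and Z-axis movement. Each row is turned
    into memberships in [0, 1] (softmax), and the streams are joined
    with a fuzzy AND (minimum). Hypothetical stand-in for the FNN.
    """
    ll = np.asarray(stream_logliks, dtype=float)
    ll = ll - ll.max(axis=1, keepdims=True)   # numerical stability
    membership = np.exp(ll) / np.exp(ll).sum(axis=1, keepdims=True)
    strength = membership.min(axis=0)
    return int(strength.argmax()), strength
```

In use, each candidate gesture would own three trained HMMs, one per feature sequence. An observed gesture is scored against all of them with `forward_loglik`, and the resulting score table is passed to `fuzzy_fuse`. The minimum rule means a gesture is accepted only when all three streams agree, which is one plausible reading of why combining streams can be more robust than a single HMM.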


Kinect; Threshold segmentation method; Complex dynamic gesture; HMM-FNN; Gesture recognition


Compliance with ethical standards


This study was funded by the key transformation project of the provincial science and technology plan (No. 20140307008GX) and the “Double ten” cultivation project of the Jilin provincial education department ([2014] No. 109).

Conflict of interest

The authors declare that they have no conflict of interest.

Research involving human participants and/or animals

In this research, ten people were chosen to participate in the gesture recognition experiment: Tingting Yang, Yanli Wen, Liqing Sun, Chunlei Shi, Xudong Ma, Jun Qi, Qing Li, Yang Yu, Jiajia Zhang, and Ning Zhou.

Informed consent

All participants voluntarily agreed to participate in this study and all gave written informed consent.



Copyright information

© OpenInterface Association 2016

Authors and Affiliations

  1. Information Engineering College, Northeast Dianli University, Jilin, China
