Human gesture recognition using a simplified dynamic Bayesian network

Roh, Myung-Cheol; Lee, Seong-Whan

doi:10.1007/s00530-014-0414-9

Regular Paper
Published: 09 October 2014

Multimedia Systems, Volume 21, pages 557–568 (2015)

Myung-Cheol Roh² &
Seong-Whan Lee¹

Abstract

In video-based human gesture recognition, it is important to combine useful features and to analyze their dynamic structure as efficiently as possible. In this paper, we propose a dynamic Bayesian network (DBN) model that simplifies the dynamics at the level of hidden variables and employs observation windows over observation time slices for robust modeling and handling of noise and other variability. The proposed simplified DBN was tested on a gesture database and an American Sign Language (ASL) database. In the experiments, the proposed DBN outperformed the other methods tested: conditional random fields (CRFs), conventional Bayesian networks (BNs), DBNs, and hidden Markov models (HMMs). The proposed DBN achieved 98% recognition accuracy in gesture recognition and 94.6% in ASL recognition, whereas the HMM and the CRF achieved 80% and 86% in gesture recognition and 75.4% and 85.4% in ASL recognition, respectively.
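
The abstract does not specify how the observation windows are built, so the following is only a minimal sketch of one plausible reading: each time slice is described by the features of its neighboring slices as well as its own, which tends to smooth out noise in any single frame. The function name `observation_windows`, the `half_width` parameter, and the boundary clamping are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def observation_windows(
    frames: Sequence[Sequence[float]], half_width: int
) -> List[List[float]]:
    """Build one windowed observation per time slice.

    For slice t, the feature vectors of slices t - half_width .. t + half_width
    are concatenated. Indices outside the sequence are clamped to the first or
    last slice (an illustrative choice, not the paper's).
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(frames)
    windows: List[List[float]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-half_width, half_width + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp at sequence boundaries
            window.extend(frames[idx])
        windows.append(window)
    return windows


# Hypothetical one-dimensional feature per frame, for illustration only.
frames = [[0.0], [1.0], [2.0], [3.0]]
windows = observation_windows(frames, half_width=1)
```

With `half_width=1`, every slice is represented by three consecutive feature vectors, so a model observing `windows[t]` sees local temporal context rather than a single frame.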

Notes

Korea University Gesture Database, http://gesturedb.korea.ac.kr.
American Sign Language Database, http://www.bu.edu/asllrp/ncslgr.html.
We would like to thank H.-D. Yang, the first author of [22], for providing the feature data and the results.

References

1. Mitra, S., Acharya, T.: Gesture recognition: a survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 37(3), 311–324 (2007). doi:10.1109/TSMCC.2007.893280
2. Bian, W., Tao, D., Rui, Y.: Cross-domain human action recognition. IEEE Trans. Syst. Man Cybern. Part B Appl. Rev. 42(2), 298–307 (2012). doi:10.1109/TSMCB.2011.2166761
3. Dielmann, A., Renals, S.: Automatic meeting segmentation using dynamic Bayesian networks. IEEE Trans. Multimed. 9(1), 25–36 (2007)
4. Du, Y., Chen, F., Xu, W., Li, Y.: Recognizing interaction activities using dynamic Bayesian network. In: Proceedings of the 17th International Conference on Pattern Recognition, vol. 1, pp. 618–621 (2006)
5. Robertson, N., Reid, I.: Behaviour understanding in video: a combined method. In: Proceedings of the Tenth IEEE International Conference on Computer Vision, vol. 1, pp. 808–815 (2005)
6. Suk, H.I., Shin, B.K., Lee, S.W.: Hand gesture recognition based on dynamic Bayesian network framework. Pattern Recognit. 43(9), 3059–3072 (2010)
7. Wang, T., Diao, Q., Zhang, Y., Song, G., Lai, C., Bradski, G.: A dynamic Bayesian network approach to multi-cue based visual tracking. In: Proceedings of the 17th International Conference on Pattern Recognition, vol. 2, pp. 167–170 (2004)
8. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. In: Proceedings of the IEEE, vol. 77, pp. 257–286 (1989)
9. Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of International Conference on Machine Learning, pp. 282–289, USA (2001)
10. Fenton, N., Neil, M.: Making decisions: using Bayesian nets and MCDA. Knowl. Based Syst. 14, 307–325 (2001)
11. Heckerman, D.: A tutorial on learning with Bayesian networks. Technical report MSR-TR-95-06, Microsoft Research (1995)
12. Murphy, K.: Dynamic Bayesian networks: representation, inference and learning. Ph.D. thesis, University of California, Berkeley (2002)
13. Bilmes, J., Bartels, C.: Graphical model architectures for speech recognition. IEEE Signal Process. Mag. 22(5), 89–100 (2005)
14. Ji, Q., Lan, P., Looney, C.: A probabilistic framework for modeling and real-time monitoring human fatigue. IEEE Trans. Syst. Man Cybern. A 36(5), 862–875 (2006)
15. Nikolopoulos, S., Papadopoulos, G., Kompatsiaris, I., Patras, I.: Evidence-driven image interpretation by combining implicit and explicit knowledge in a Bayesian network. IEEE Trans. Syst. Man Cybern. Part B Appl. Rev. 41(5), 1366–1381 (2011). doi:10.1109/TSMCB.2011.2147781
16. Park, S., Aggarwal, J.: A hierarchical Bayesian network for event recognition of human actions and interactions. Multimed. Syst. 10(2), 164–179 (2004)
17. Darrell, T., Pentland, A.: Space-time gestures. In: Proceedings of CVPR '93, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (1993)
18. Li, H., Greenspan, M.: Multi-scale gesture recognition from time-varying contours. Int. Conf. Comput. Vis. 1, 226–234 (2005)
19. Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoustics Speech Signal Proc. 26(1), 43–49 (1978)
20. Ahmad, M., Lee, S.W.: Human action recognition using shape and CLG-motion flow from multi-view image sequences. Pattern Recognit. 41(7), 2237–2252 (2008)
21. Starner, T., Weaver, J., Pentland, A.: Real-time American Sign Language recognition using desk and wearable computer based video. IEEE Trans. Pattern Anal. Mach. Intell. 20(12), 1371–1375 (1998)
22. Yang, H.D., Sclaroff, S., Lee, S.W.: Sign language spotting with a threshold model based on conditional random fields. IEEE Trans. Pattern Anal. Mach. Intell. 31(7), 1264–1277 (2009)
23. Moenne-Loccoz, N., Bremond, F., Thonnat, M.: Recurrent Bayesian network for the recognition of human behaviors from video. In: Proceedings of 3rd International Conference on Computer Vision Systems, pp. 68–77 (2003)
24. Wang, S., Quattoni, A., Morency, L.P., Demirdjian, D., Darrell, T.: Hidden conditional random fields for gesture recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1521–1527 (2006)
25. Murphy, K.: Bayes net toolbox for Matlab (2014). http://code.google.com/p/bnt/. Accessed Sept. 2014
26. Kudo, T.: CRF++: Yet another CRF toolkit (2005). http://code.google.com/p/crfpp/. Accessed Sept. 2014
27. Lee, H.K., Kim, J.H.: An HMM-based threshold model approach for gesture recognition. IEEE Trans. Pattern Anal. Mach. Intell. 21(10), 961–973 (1999)
28. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 511–519 (2001)
29. Yang, H.D., Lee, S.W., Lee, S.W.: Multiple human detection and tracking based on weighted temporal texture features. Int. J. Pattern Recognit. Artif. Intell. 20(3), 377–391 (2006)

Acknowledgments

This research was supported by the Implementation of Technologies for Identification, Behavior, and Location of Human based on Sensor Network Fusion Program through the Ministry of Trade, Industry and Energy (Grant No. 10041629) and the 2014 R&D Program for S/W Computing Industrial Core Technology through the Ministry of Science, ICT and Future Planning/Korea Evaluation Institute of Industrial Technology (Project No. 2014-044-023-001), Korea.

Author information

Authors and Affiliations

Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
Seong-Whan Lee
S-1 Corporation, Seoul, Korea
Myung-Cheol Roh

Corresponding author

Correspondence to Seong-Whan Lee.

Additional information

Communicated by Q. Tian.

About this article

Cite this article

Roh, MC., Lee, SW. Human gesture recognition using a simplified dynamic Bayesian network. Multimedia Systems 21, 557–568 (2015). https://doi.org/10.1007/s00530-014-0414-9

Received: 11 June 2013
Accepted: 26 August 2014
Published: 09 October 2014
Issue Date: November 2015
DOI: https://doi.org/10.1007/s00530-014-0414-9
