Kinect-based Taiwanese sign-language recognition system

Abstract

Gesture recognition is an important component of many intelligent human–computer interaction applications. For example, a real-time sign-language recognition system must detect and interpret hand gestures. Many vision-based sign-language recognition methods have been proposed over the years, with mixed results in terms of usability. Some systems are limited to recognizing only a few gestures, while others require a 3D camera to provide depth information that improves recognition accuracy. In this paper, a Kinect-based Taiwanese sign-language recognition system is proposed. Three main features are extracted from the signing gestures: hand positions, hand signing direction, and hand shapes. The hand positions are readily available from the input sensor. The signing direction is determined by applying a hidden Markov model (HMM) to the trajectory of the hand movement, and a support vector machine (SVM) is trained and used to recognize the hand shapes. Experimental results show that the proposed system achieved a recognition rate of 85.14%.
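
To make the HMM step more concrete, the following is a minimal sketch of how a signing direction could be picked from a hand trajectory. It is not the authors' implementation: the 8-way direction quantization, the two-state toy models, all probability values, and the names `quantize_trajectory`, `make_toy_model`, `forward_log_likelihood`, `classify_direction`, and `TOY_MODELS` are assumptions made purely for illustration. The sketch assumes hand positions are given as (x, y) pairs with y pointing up.

```python
import math

N_BINS = 8  # assumed number of direction codes (45 degrees each)


def quantize_trajectory(points, n_bins=N_BINS):
    """Turn consecutive (x, y) hand positions into direction codes 0..n_bins-1."""
    codes = []
    step = 2 * math.pi / n_bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # no movement between these two frames
        angle = math.atan2(dy, dx) % (2 * math.pi)
        codes.append(int(round(angle / step)) % n_bins)
    return codes


def make_toy_model(peak_code, n_bins=N_BINS, peak_prob=0.72):
    """Hypothetical 2-state left-to-right HMM whose emissions favour one code."""
    other = (1.0 - peak_prob) / (n_bins - 1)
    row = [other] * n_bins
    row[peak_code] = peak_prob
    return {
        "start": [1.0, 0.0],
        "trans": [[0.6, 0.4], [0.0, 1.0]],
        "emit": [row[:], row[:]],
    }


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model)."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify_direction(points, models):
    """Return the label of the model that best explains the trajectory."""
    codes = quantize_trajectory(points)
    if not codes:
        raise ValueError("trajectory contains no movement")
    return max(models, key=lambda name: forward_log_likelihood(codes, **models[name]))


# Toy models for four directions; a real system would learn these from data.
TOY_MODELS = {
    "right": make_toy_model(0),
    "up": make_toy_model(2),
    "left": make_toy_model(4),
    "down": make_toy_model(6),
}
```

With these toy models, a call such as `classify_direction([(0, 0), (1, 0), (2, 0.1), (3, 0)], TOY_MODELS)` selects the model whose emissions best match a mostly rightward movement. In practice the model parameters would be estimated from recorded signing trajectories rather than set by hand.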

Acknowledgments

This research was partially supported by the Ministry of Science and Technology of Taiwan, R.O.C., under grant numbers 100-2511-S-003-020-MY2 and 101-2511-S-003-057-MY3.

Author information

Corresponding author

Correspondence to Fu-Hao Yeh.

About this article

Cite this article

Lee, G.C., Yeh, F. & Hsiao, Y. Kinect-based Taiwanese sign-language recognition system. Multimed Tools Appl 75, 261–279 (2016). https://doi.org/10.1007/s11042-014-2290-x

Keywords

  • Sign-language recognition
  • Gesture recognition
  • Kinect