
Hand gesture recognition from depth and infrared Kinect data for CAVE applications interaction

Multimedia Tools and Applications

Abstract

This paper presents a real-time framework that combines depth data and infrared laser speckle pattern (ILSP) images, captured from a Kinect device, for static hand gesture recognition to interact with CAVE applications. At system startup, background removal and hand position detection are performed using only the depth map. Tracking then begins, using the hand positions from previous frames to locate the hand centroid in the current frame. The resulting point serves as the seed for a region growing algorithm that segments the hand in the depth map. The output is a mask that is then used to segment the hand in the ILSP frame sequence. Next, we apply motion restrictions for gesture spotting in order to label each image as "Gesture" or "Non-Gesture". The ILSP counterparts of the frames labeled "Gesture" are enhanced using mask subtraction, contrast stretching, median filtering, and histogram equalization. The enhanced images are the input to feature extraction with the scale-invariant feature transform (SIFT), followed by bag-of-visual-words construction and classification with a multi-class support vector machine (SVM). Finally, we build a grammar based on the hand gesture classes to convert the classification results into control commands for the CAVE application. The tests and comparisons performed show that the implemented plugin is an efficient solution. We achieve state-of-the-art recognition accuracy as well as efficient object manipulation in a virtual scene visualized in the CAVE.
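The seeded segmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the growth criterion, so the 4-connected neighborhood, the depth `tolerance`, and the function names `region_grow` and `apply_mask` are all assumptions made for illustration. The sketch grows a mask outward from a seed pixel in a depth map and then reuses that mask on a second image of the same size, standing in for the ILSP frame.

```python
from collections import deque


def region_grow(depth, seed, tolerance):
    """Grow a binary mask from `seed` over a 2D depth map (list of lists).

    Assumed criterion (not taken from the paper): a 4-connected neighbor
    joins the region when its depth differs from the pixel that reached it
    by at most `tolerance`.
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[0] * cols for _ in range(rows)]
    seed_row, seed_col = seed
    mask[seed_row][seed_col] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc]:
                if abs(depth[nr][nc] - depth[r][c]) <= tolerance:
                    mask[nr][nc] = 1
                    queue.append((nr, nc))
    return mask


def apply_mask(image, mask):
    """Keep the pixels of `image` where `mask` is set and zero the rest."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy data: a near object (about 800 depth units) on a far background (2000).
depth_map = [
    [2000, 2000, 2000, 2000],
    [2000, 800, 810, 2000],
    [2000, 805, 815, 2000],
    [2000, 2000, 2000, 2000],
]
# Hypothetical intensity image aligned pixel-for-pixel with the depth map.
ilsp_frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

hand_mask = region_grow(depth_map, seed=(1, 1), tolerance=20)
hand_only = apply_mask(ilsp_frame, hand_mask)
```

In this toy case the mask covers only the central 2×2 block, because the jump to the background depth exceeds the tolerance. A real pipeline would also need the depth and infrared images to be registered to each other, which this sketch simply assumes.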

Notes

  1. Video Demonstration at http://virtual01.lncc.br/ilsp-hand-gesture/videos.html



Author information

Corresponding author

Correspondence to Diego Q. Leite.

About this article

Cite this article

Leite, D.Q., Duarte, J.C., Neves, L.P. et al. Hand gesture recognition from depth and infrared Kinect data for CAVE applications interaction. Multimed Tools Appl 76, 20423–20455 (2017). https://doi.org/10.1007/s11042-016-3959-0

  • DOI: https://doi.org/10.1007/s11042-016-3959-0
