
Occlusion-robust bimanual gesture recognition by fusing multi-views


Abstract

Human hands are dexterous and have always been an intuitive means of instructing or communicating with peers. In recent years, hand gestures have also been widely adopted as a novel modality for human-computer interaction. However, existing approaches mostly target single-handed gestures and do not handle gestures performed with two hands in close proximity (bimanual gestures). This paper therefore tackles the problem of bimanual gesture recognition, which remains little studied in the literature. To overcome the critical hand-hand self-occlusion problem in bimanual gestures, multiple cameras at different viewpoints are used. A tailored multi-camera system is constructed to acquire multi-view bimanual gesture data. Classifiers are trained on our bimanual gesture dataset using both shape and color features. A weighted-sum fusion scheme then ensembles the predictions of the different classifiers, where the weight of each view is optimized according to how well recognition performs on that view. Our experiments show that the multi-view results outperform single-view results. The proposed method is especially suitable for interactive multimedia applications, as illustrated by our two demo programs: a video game and a sign-language learner.
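As a rough illustration of the fusion step described above, the sketch below combines per-view classifier scores by a weighted sum, with each view's weight set in proportion to its validation accuracy. The proportional weighting rule, the function name fuse_views, and the example numbers are assumptions made for illustration; the abstract only states that the weights are optimized according to how well recognition performs on each view.

```python
import numpy as np

def fuse_views(view_scores, view_accuracies):
    """Weighted-sum fusion of per-view classifier scores.

    view_scores: list of (num_classes,) probability/score vectors,
        one per camera view.
    view_accuracies: per-view validation accuracies used to derive the
        fusion weights (assumed rule: weights proportional to accuracy).
    Returns the predicted class index and the fused score vector.
    """
    acc = np.asarray(view_accuracies, dtype=float)
    weights = acc / acc.sum()                    # normalize so weights sum to 1
    fused = np.zeros_like(view_scores[0], dtype=float)
    for w, s in zip(weights, view_scores):
        fused += w * np.asarray(s, dtype=float)  # weighted sum over views
    return int(np.argmax(fused)), fused

# Hypothetical example: three views scoring a 4-class bimanual gesture.
scores = [
    np.array([0.10, 0.60, 0.20, 0.10]),  # frontal view
    np.array([0.05, 0.30, 0.55, 0.10]),  # side view (partially occluded)
    np.array([0.15, 0.55, 0.20, 0.10]),  # top view
]
label, fused = fuse_views(scores, view_accuracies=[0.92, 0.70, 0.85])
print(label, fused)
```

In this sketch, views that recognize gestures more reliably contribute more to the fused decision, so a view suffering from hand-hand occlusion is down-weighted rather than allowed to dominate.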



Acknowledgments

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong SAR, China (Ref. No. UGC/FDS11/E03/15).

Author information

Corresponding author

Correspondence to Wai-Man Pang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

(MP4 13.2 MB)

About this article


Cite this article

Poon, G., Kwan, K.C. & Pang, WM. Occlusion-robust bimanual gesture recognition by fusing multi-views. Multimed Tools Appl 78, 23469–23488 (2019). https://doi.org/10.1007/s11042-019-7660-y

