Deep Gesture: Static Hand Gesture Recognition Using CNN

  • Aparna Mohanty
  • Sai Saketh Rambhatla
  • Rajiv Ranjan Sahay
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 460)


Hand gestures are an integral part of communication, and in several scenarios they are the only means of communication. Examples include hand signals given by a traffic policeman, a TV news reader gesturing the news for the deaf, signalling at an airport to guide aircraft, and playing games. There is therefore a need for robust hand pose recognition (HPR) that can serve such applications. Existing state-of-the-art methods are challenged by clutter in the background. We propose a deep learning framework to recognise hand gestures robustly. Specifically, we propose a convolutional neural network (CNN) that identifies hand postures despite variation in hand size, spatial location in the image, and clutter in the background. The advantage of our method is that it needs no explicit feature extraction step. Without explicitly segmenting the foreground, the proposed CNN learns to recognise the hand pose even in the presence of complex, varying backgrounds or illumination. We provide experimental results demonstrating the superior performance of the proposed algorithm on benchmark datasets.


Hand gesture recognition · Deep learning · Convolutional neural network



Copyright information

© Springer Science+Business Media Singapore 2017

Authors and Affiliations

  • Aparna Mohanty (1)
  • Sai Saketh Rambhatla (1)
  • Rajiv Ranjan Sahay (1)
  1. Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India
