International Journal of Computer Vision, Volume 101, Issue 3, pp 403–419

Attention Based Detection and Recognition of Hand Postures Against Complex Backgrounds

  • Pramod Kumar Pisharady
  • Prahlad Vadakkepat
  • Ai Poh Loh


A system for the detection, segmentation and recognition of multi-class hand postures against complex natural backgrounds is presented. Visual attention, the cognitive process of selectively concentrating on a region of interest in the visual field, helps humans recognize objects in cluttered natural scenes. The proposed system uses a Bayesian model of visual attention to generate a saliency map and to detect and identify the hand region. Feature-based visual attention is implemented using a combination of high-level (shape, texture) and low-level (color) image features. The shape and texture features are extracted from a skin similarity map using a computational model of the ventral stream of the visual cortex. The skin similarity map, which represents the similarity of each pixel to human skin color in the HSI color space, enhances the edges and shapes within skin-colored regions. The color features are the discretized chrominance components in the HSI and YCbCr color spaces, together with the skin similarity map. The hand postures are classified from the shape and texture features with a support vector machine classifier. A new 10-class complex-background hand posture dataset, NUS hand posture dataset-II, is developed for testing the proposed algorithm (40 subjects, different ethnicities, various hand sizes, 2750 hand postures and 2000 background images). The algorithm is tested for hand detection and hand posture recognition using 10-fold cross-validation. The experimental results show that the algorithm's performance is person independent and reliable against variations in hand size and complex backgrounds. The algorithm achieved a recognition rate of 94.36%. In comparison with existing methods, the proposed algorithm performs better.
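To make the idea of a skin similarity map concrete, the following is a minimal sketch in Python, not the authors' implementation. The abstract only states that the map represents each pixel's similarity to human skin color in HSI space; the Gaussian form of the similarity, the reference hue and saturation, the tolerance values, and all function names below are illustrative assumptions.

```python
import math

# Hypothetical reference skin chrominance and tolerances.
# These values are assumptions for illustration, not taken from the paper.
REF_HUE = math.radians(20.0)
REF_SAT = 0.3
SIGMA_HUE = math.radians(15.0)
SIGMA_SAT = 0.2


def rgb_to_hs(r, g, b):
    """Return (hue in radians, saturation) of the HSI model for RGB in [0, 1]."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0
    saturation = max(0.0, 1.0 - min(r, g, b) / intensity)
    denom = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denom == 0.0:
        # Gray pixel: hue is undefined, so report 0 by convention.
        return 0.0, saturation
    ratio = 0.5 * ((r - g) + (r - b)) / denom
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding error
    hue = math.acos(ratio)
    if b > g:
        hue = 2.0 * math.pi - hue
    return hue, saturation


def skin_similarity(r, g, b):
    """Assumed Gaussian similarity of one pixel to the reference skin color."""
    hue, sat = rgb_to_hs(r, g, b)
    dh = abs(hue - REF_HUE)
    dh = min(dh, 2.0 * math.pi - dh)  # hue is an angle, so wrap the distance
    ds = sat - REF_SAT
    return math.exp(-0.5 * ((dh / SIGMA_HUE) ** 2 + (ds / SIGMA_SAT) ** 2))


def skin_similarity_map(image):
    """Apply skin_similarity to every (r, g, b) pixel of a nested-list image."""
    return [[skin_similarity(*pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    # A skin-toned pixel next to a saturated blue pixel.
    image = [[(0.8, 0.6, 0.5), (0.1, 0.2, 0.9)]]
    print(skin_similarity_map(image))
```

Only the chrominance components (hue and saturation) enter the similarity here, which is one plausible way to reduce sensitivity to illumination; the paper itself may define the map differently.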


Computer vision, Pattern recognition, Hand gesture recognition, Complex backgrounds, Visual attention, Biologically inspired features



The authors would like to thank Ms. Ma Zin Thu Shein for taking part in the shooting of NUS hand posture dataset-II. The authors also express their appreciation to all 40 subjects who volunteered for the development of the dataset.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Pramod Kumar Pisharady (1)
  • Prahlad Vadakkepat (1)
  • Ai Poh Loh (1)
  1. Department of Electrical and Computer Engineering, National University of Singapore, Singapore
