Abstract
Hand gestures are an effective means of communication between humans, and active research aims to replicate that success in computer vision systems. Human–computer interaction can be significantly improved by advances in systems capable of recognizing different hand gestures. Many earlier works consider only hand gestures that are easily distinguished from one another, and therefore often select just a few gestures from American Sign Language (ASL) for recognition. In contrast, we propose applying deep learning to the recognition of all 24 hand gestures in Thomas Moeslund’s gesture recognition database. We show that deeper, more biologically inspired neural networks, such as the convolutional neural network and the stacked denoising autoencoder, are capable of learning the complex hand gesture classification task with lower error rates. The considered networks are trained and tested on data from the above-mentioned public database; the results are then compared against earlier works in which only small subsets of the ASL hand gestures are considered for recognition.
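To make the denoising criterion behind a stacked denoising autoencoder concrete, the following is a minimal NumPy sketch of one layer's forward pass. It is not the authors' implementation: the 32×32 input size, the hidden width, the 30% masking noise, the tied weights, the sigmoid units, the squared-error loss, and every function name here are assumptions chosen only for illustration, and random arrays stand in for the gesture images.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrupt(x, level, rng):
    """Masking noise: zero each input value with probability `level`."""
    mask = rng.random(x.shape) >= level
    return x * mask

def dae_forward(x_noisy, W, b_hid, b_out):
    """Encode the corrupted input, then decode with tied weights (W.T)."""
    h = sigmoid(x_noisy @ W + b_hid)
    x_hat = sigmoid(h @ W.T + b_out)
    return h, x_hat

def reconstruction_error(x_clean, x_hat):
    """Mean squared error measured against the clean (uncorrupted) input."""
    return float(np.mean((x_clean - x_hat) ** 2))

rng = np.random.default_rng(0)
n_samples, n_pixels, n_hidden = 8, 32 * 32, 100  # assumed sizes

# Random stand-in for flattened grayscale images scaled to [0, 1].
x = rng.random((n_samples, n_pixels))

# Small random weights and zero biases (assumed initialization).
W = rng.normal(0.0, 0.01, size=(n_pixels, n_hidden))
b_hid = np.zeros(n_hidden)
b_out = np.zeros(n_pixels)

x_noisy = corrupt(x, level=0.3, rng=rng)
h, x_hat = dae_forward(x_noisy, W, b_hid, b_out)
loss = reconstruction_error(x, x_hat)
```

Training would minimize `loss` with respect to `W`, `b_hid`, and `b_out`. In a stacked arrangement, each further layer is typically trained the same way on the hidden codes `h` of the layer below, followed by supervised fine-tuning with a classifier on top (here, one output per gesture class).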
Cite this article
Oyedotun, O.K., Khashman, A. Deep learning in vision-based static hand gesture recognition. Neural Comput & Applic 28, 3941–3951 (2017). https://doi.org/10.1007/s00521-016-2294-8
DOI: https://doi.org/10.1007/s00521-016-2294-8