Abstract
Fingerspelling recognition of Chinese sign language offers an opportunity to reduce the communication barriers between hearing-impaired and hearing people, and it occupies an important position in sign language recognition. This study proposes an eight-layer convolutional neural network combined with three advanced techniques: batch normalization, dropout, and stochastic pooling. The output of stochastic pooling is obtained by sampling from a multinomial distribution formed from the activations of each pooling region. In addition, we used data augmentation to enlarge the training set. In total, 10 runs were carried out, with the hold-out split set randomly for each run. Our method achieved a highest accuracy of 90.91% and an overall accuracy of 89.32 ± 1.07%, which was superior to the three state-of-the-art approaches it was compared with.
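The sampling step of stochastic pooling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stochastic_pool_region` and `stochastic_pool_2d` are hypothetical, the activations are assumed to be non-negative (for example, after a ReLU), pooling regions are assumed to be non-overlapping, and plain Python is used instead of a deep learning framework. Each activation in a region is selected with probability proportional to its value, and the selected activation becomes the pooled output.

```python
import random


def stochastic_pool_region(activations, rng):
    """Sample one activation from a pooling region.

    Each activation a_i is drawn with probability p_i = a_i / sum_k a_k.
    Assumes non-negative activations (an assumption of this sketch).
    """
    if any(a < 0 for a in activations):
        raise ValueError("activations must be non-negative")
    total = sum(activations)
    if total == 0:
        return 0.0  # all-zero region: nothing to sample from
    # random.choices normalises the weights internally.
    return rng.choices(activations, weights=activations, k=1)[0]


def stochastic_pool_2d(feature_map, size, rng):
    """Apply stochastic pooling over non-overlapping size x size regions."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Flatten the current pooling region into a list.
            region = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(stochastic_pool_region(region, rng))
        pooled.append(out_row)
    return pooled


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    fmap = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [2.0, 3.0, 0.0, 5.0],
            [0.0, 4.0, 0.0, 0.0]]
    print(stochastic_pool_2d(fmap, 2, rng))
```

Because a zero activation has zero probability, a region with a single non-zero value always returns that value, while a region with several non-zero values returns a different one from call to call. This randomness during training is what distinguishes stochastic pooling from max pooling, which would always return the largest activation.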
Acknowledgements
This work was supported by the Jiangsu Overseas Visiting Scholar Program for University Prominent Young & Middle-aged Teachers and Presidents of China, the Henan Key Research and Development Project (182102310629), and the Natural Science Foundation of China (61602250).
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Jiang, X., Lu, M. & Wang, SH. An eight-layer convolutional neural network with stochastic pooling, batch normalization and dropout for fingerspelling recognition of Chinese sign language. Multimed Tools Appl 79, 15697–15715 (2020). https://doi.org/10.1007/s11042-019-08345-y
DOI: https://doi.org/10.1007/s11042-019-08345-y