Abstract
Before spoken languages came into existence, sign language was the mode of communication. For human–computer interaction, recognizing sign languages is vital, and this is where hand gesture recognition comes into play. With the advancement of technology and its wide range of applications, hand gesture recognition has become a widely studied field of research. It has gained popularity largely because of its application in sign language detection for people with speech and hearing impairments. This paper presents a methodology for hand gesture recognition using a three-dimensional convolutional neural network (3D CNN). The dataset used is MINDS-Libras, a Brazilian sign language dataset. We propose a novel error correction-based key frame extraction technique that selects significant key frames for video summarization. The chosen key frames are preprocessed through region of interest selection, background removal, segmentation, binarization, and resizing. The preprocessed frames are then given as input to the proposed 3D CNN for hand gesture classification, which achieves an accuracy of 98% and performs better than state-of-the-art techniques.
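To make the preprocessing chain in the abstract more concrete, the following is a minimal, dependency-free sketch of three of the named steps (region of interest selection, binarization, and resizing) applied to a short sequence of key frames. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed rectangular ROI, the threshold of 128, and the synthetic frames are all hypothetical, and background removal and segmentation are not modelled separately here. Key frame extraction itself is assumed to have already happened.

```python
# Illustrative preprocessing sketch; NOT the authors' implementation.
# A frame is a list of rows, each row a list of grayscale intensities (0-255).
# All names, the ROI and the threshold below are hypothetical.


def crop_roi(frame, top, left, height, width):
    """Keep only the rectangular region of interest."""
    return [row[left:left + width] for row in frame[top:top + height]]


def binarize(frame, threshold=128):
    """Map each pixel to 1 (foreground) or 0 (background)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in frame]


def resize_nearest(frame, out_h, out_w):
    """Resize with nearest-neighbour sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess_clip(key_frames, roi, size, threshold=128):
    """Apply crop -> binarize -> resize to every key frame.

    Returns a nested list shaped (depth, height, width), i.e. the
    frame-stacked layout that a 3D convolution slides over.
    """
    top, left, height, width = roi
    out_h, out_w = size
    clip = []
    for frame in key_frames:
        region = crop_roi(frame, top, left, height, width)
        mask = binarize(region, threshold)
        clip.append(resize_nearest(mask, out_h, out_w))
    return clip


# Two synthetic 4x4 frames; the bright patch stands in for the hand.
frame_a = [[0, 0, 0, 0], [0, 200, 200, 0], [0, 200, 200, 0], [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0], [0, 0, 220, 220], [0, 0, 220, 220], [0, 0, 0, 0]]
clip = preprocess_clip([frame_a, frame_b], roi=(1, 1, 2, 2), size=(4, 4))
```

Stacking the processed key frames along a leading depth axis is what lets a 3D convolution learn motion across frames as well as shape within a frame. In practice an array library would replace the nested lists.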
Data availability
Data are available on reasonable request.
Abbreviations
- HCI: Human–computer interaction
- ECKFE: Error correction-based key frame extraction
- ROI: Region of interest
- 3D CNN: Three-dimensional convolutional neural network
- SVM: Support vector machine
- KNN: K-nearest neighbors
- CNN: Convolutional neural network
- RCNN: Region-based convolutional neural network
- LSTM: Long short-term memory
- RGB: Red–green–blue
- HSV: Hue–saturation–value
- GSM: Global system for mobile communications
- SMOTE: Synthetic minority oversampling technique
- GEI: Gait energy image
- ANN: Artificial neural network
- VGG16: Visual geometry group 16
- ResNet-50: Residual network with 50 layers
- ReLU: Rectified linear unit
- DTW: Dynamic time warping
- HOG: Histogram of oriented gradient
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Bharti, S., Balmik, A. & Nandy, A. Novel error correction-based key frame extraction technique for dynamic hand gesture recognition. Neural Comput & Applic 35, 21165–21180 (2023). https://doi.org/10.1007/s00521-023-08774-9
DOI: https://doi.org/10.1007/s00521-023-08774-9