Real-time hand gesture recognition using multiple deep learning architectures

Aggarwal, Apeksha; Bhutani, Nikhil; Kapur, Ritvik; Dhand, Geetika; Sheoran, Kavita

doi:10.1007/s11760-023-02626-8

Original Paper
Published: 05 July 2023

Volume 17, pages 3963–3971 (2023)

Signal, Image and Video Processing

Abstract

Human gesture recognition, the automated analysis of human gestures by machine, is one of the most challenging problems in computer vision. Most of the literature, however, classifies isolated data in which a single image or video contains exactly one gesture. This work targets the identification of human gestures from a continuous stream of input taken from a live camera feed, with no pre-defined gesture boundaries. The task becomes more complex still given diverse lighting conditions, varying backgrounds and different gesture positions within the same input stream. This work presents a deep learning architecture that classifies gestures taken from multiple viewpoints and at varying object sizes. For classification, we compiled a real-world dataset of 4500 images collected from people of varying age groups, ranging from 10 to 50 years. The dataset covers a wide variety of characteristics to address the complexities of the gesture recognition process. A real-time system is developed that captures, analyzes and classifies live gesture videos frame by frame. To validate our approach, we compared our results against multiple deep learning architectures and on other benchmark datasets. The results show that our approach outperforms existing works and detects gestures under deteriorating lighting conditions and unclear gesture positions, achieving an accuracy of 99.63%.
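
The frame-by-frame operation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the 64 x 64 input size, the nearest-neighbour resize and the brightness-based `dummy_model` are all assumptions made for illustration, and a trained network would take the place of `dummy_model` in practice.

```python
from typing import Callable, Iterable, Iterator, Tuple

import numpy as np

# Hypothetical label set; the abstract does not list the gesture classes.
LABELS = ("gesture_a", "gesture_b", "gesture_c")


def preprocess(frame: np.ndarray, size: int = 64) -> np.ndarray:
    """Nearest-neighbour resize to size x size and scale pixels to [0, 1]."""
    height, width = frame.shape[:2]
    rows = np.arange(size) * height // size   # source row for each output row
    cols = np.arange(size) * width // size    # source column for each output column
    resized = frame[rows][:, cols]
    return resized.astype(np.float32) / 255.0


def classify_stream(
    frames: Iterable[np.ndarray],
    model: Callable[[np.ndarray], np.ndarray],
) -> Iterator[Tuple[int, str, float]]:
    """Classify each frame independently, yielding (index, label, confidence)."""
    for index, frame in enumerate(frames):
        probs = model(preprocess(frame))
        best = int(np.argmax(probs))
        yield index, LABELS[best], float(probs[best])


def dummy_model(x: np.ndarray) -> np.ndarray:
    """Stand-in for a trained classifier (assumption): scores by mean brightness."""
    centers = np.array([0.0, 0.5, 1.0])
    scores = np.exp(-10.0 * (float(x.mean()) - centers) ** 2)
    return scores / scores.sum()


if __name__ == "__main__":
    # Synthetic dark, grey and bright frames stand in for a camera feed.
    feed = [np.full((120, 160, 3), v, dtype=np.uint8) for v in (0, 128, 255)]
    for index, label, confidence in classify_stream(feed, dummy_model):
        print(index, label, round(confidence, 3))
```

Because `classify_stream` is a generator, each prediction is available as soon as its frame has been processed, so the loop never needs a whole clip with known gesture boundaries. A live camera source could be passed in as any iterable of image arrays.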

Availability of data and materials

Not applicable.

Acknowledgements

This work was conducted as part of an internship project at AI-Shala Pvt. Limited.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Author information

Nikhil Bhutani and Ritvik Kapur have contributed equally to this work.

Authors and Affiliations

Department of CSE, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, 247667, India
Apeksha Aggarwal
Maharaja Surajmal Institute of Technology, Delhi, Delhi, 110058, India
Nikhil Bhutani, Ritvik Kapur, Geetika Dhand & Kavita Sheoran

Contributions

All authors read and approved the final manuscript. NB and RK: methodology, software, validation, formal analysis, investigation, resources, data curation. AA: conceptualization, writing (original draft, review & editing), visualization, project administration, supervision. KS and GD: writing (review).

Corresponding author

Correspondence to Ritvik Kapur.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aggarwal, A., Bhutani, N., Kapur, R. et al. Real-time hand gesture recognition using multiple deep learning architectures. SIViP 17, 3963–3971 (2023). https://doi.org/10.1007/s11760-023-02626-8

Received: 10 March 2023
Revised: 15 April 2023
Accepted: 08 May 2023
Published: 05 July 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11760-023-02626-8
