Real-time hand gesture recognition using multiple deep learning architectures

  • Original Paper
Signal, Image and Video Processing

Abstract

Human gesture recognition, in which a machine analyzes human gestures, is one of the most challenging problems in computer vision. However, most of the literature on gesture recognition relies on isolated data, with only one gesture per image or video, to classify gestures. This work targets the identification of human gestures from a continuous stream of input taken from a live camera feed, with no pre-defined gesture boundaries. The task becomes even more complex given the diverse lighting conditions, varying backgrounds and different gesture positions within the same input stream. This work presents an effective deep learning architecture to classify gestures captured from multiple viewpoints and at varying object sizes. To perform the classification, we have compiled a real-world dataset of 4500 images collected from persons of varying age groups, ranging from 10 to 50 years. The dataset was assembled to cover a wide variety of characteristics, in order to address the complexities of the gesture recognition process. A real-time system is developed that captures, analyzes and classifies live gesture videos frame by frame. To validate our approach, we have compared our results with multiple deep learning architectures and on other benchmark datasets. The results show that our approach outperforms existing works and is able to detect gestures under deteriorating lighting conditions and with unclear gesture positions, achieving an accuracy of 99.63%.
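
As a rough illustration of the frame-by-frame processing described in the abstract, the following minimal Python sketch classifies each frame of a stream independently. It is not the authors' implementation. The names `preprocess`, `classify_stream`, `toy_model` and `LABELS` are hypothetical, the toy model is a brightness-based stand-in for a trained network, and the confidence threshold used to report "no gesture" is an assumption made here to reflect the absence of pre-defined gesture boundaries.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Sequence, Tuple

# A grayscale frame as rows of pixel intensities in [0, 255] (assumed format).
Frame = List[List[float]]


def preprocess(frame: Frame) -> Frame:
    """Scale pixel intensities from [0, 255] to [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in frame]


def classify_stream(
    frames: Iterable[Frame],
    model: Callable[[Frame], Sequence[float]],
    labels: Sequence[str],
    min_confidence: float = 0.5,
) -> Iterator[Tuple[int, Optional[str], float]]:
    """Yield (frame_index, label, confidence) for every frame in the stream.

    The label is None when the best score falls below `min_confidence`,
    which is one simple (assumed) way to signal "no gesture in this frame".
    """
    for index, frame in enumerate(frames):
        scores = model(preprocess(frame))
        best = max(range(len(scores)), key=scores.__getitem__)
        confidence = scores[best]
        label = labels[best] if confidence >= min_confidence else None
        yield index, label, confidence


def toy_model(frame: Frame) -> List[float]:
    """Hypothetical stand-in for a trained CNN: scores from mean brightness."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    mean = total / count
    return [1.0 - mean, mean]


LABELS = ["fist", "palm"]  # hypothetical gesture classes

dark = [[0.0] * 4 for _ in range(4)]
bright = [[255.0] * 4 for _ in range(4)]
results = list(classify_stream([dark, bright], toy_model, LABELS))
print(results)
```

In a live setting, the list of frames would be replaced by a generator that reads from the camera, and `toy_model` by the trained classifier. Because the function is written as a generator, each frame is handled as it arrives rather than after the whole stream has been collected.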

Availability of data and materials

Not applicable.

Acknowledgements

This work was conducted as part of an internship project at AI-Shala Pvt. Limited.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.

Author information

Contributions

All authors read and approved the final manuscript. NB and RK: methodology, software, validation, formal analysis, investigation, resources, data curation. AA: conceptualization, writing – original draft, review & editing, visualization, project administration, supervision. KS and GD: writing – review.

Corresponding author

Correspondence to Ritvik Kapur.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aggarwal, A., Bhutani, N., Kapur, R. et al. Real-time hand gesture recognition using multiple deep learning architectures. SIViP 17, 3963–3971 (2023). https://doi.org/10.1007/s11760-023-02626-8

  • DOI: https://doi.org/10.1007/s11760-023-02626-8
