mIV3Net: modified inception V3 network for hand gesture recognition

Multimedia Tools and Applications

Abstract

Hand gestures play an important role in communication among people with hearing and speech disorders. Hand gesture recognition (HGR) is the backbone of human–computer interaction (HCI). Most reported HGR techniques perform poorly against complex backgrounds. Moreover, according to the literature, most existing HGR methods report recognition performance on only a few selected gestures with inter-class similarity. This paper proposes a two-phase deep learning-based HGR system that mitigates the complex-background issue and considers all gesture classes. In the first phase, the Inception V3 architecture is modified to reduce its computational resource requirements; the resulting network is named mIV3Net (modified Inception V3 network). In the second phase, mIV3Net is fine-tuned to give more attention to prominent features. As a result, better abstract knowledge is used for gesture recognition, and the proposed algorithm is more discriminative. The efficacy and generalization of the proposed two-phase HGR system are validated experimentally on five publicly available standard datasets: MUGD, ISL, ArSL, NUS-I, and NUS-II. The accuracies of the proposed system on these datasets, in the above order, are 97.14%, 99.3%, 97.4%, 99%, and 99.8%, which are improvements of 12.58%, 2.54%, 2.73%, 0.56%, and 2.02%, respectively, over state-of-the-art HGR systems.
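The per-dataset figures quoted above are easier to compare when lined up side by side. The short Python sketch below does only that arithmetic. It rests on one assumption that the abstract does not confirm: that each reported improvement is an absolute percentage-point difference, so the implied prior best accuracy is the proposed accuracy minus the improvement. The function names (`implied_baselines`, `summary_rows`) are illustrative, and the derived baseline values are not figures stated in the article.

```python
# Accuracies (%) and improvements quoted in the abstract, in dataset order.
DATASETS = ["MUGD", "ISL", "ArSL", "NUS-I", "NUS-II"]
ACCURACY = [97.14, 99.3, 97.4, 99.0, 99.8]
IMPROVEMENT = [12.58, 2.54, 2.73, 0.56, 2.02]


def implied_baselines(accuracy, improvement):
    """Return accuracy minus improvement for each dataset.

    ASSUMPTION: improvements are absolute percentage-point gains.
    If they were relative gains, this subtraction would not apply.
    """
    return [round(a - d, 2) for a, d in zip(accuracy, improvement)]


def summary_rows():
    """Pair each dataset with its accuracy, gain, and implied baseline."""
    baselines = implied_baselines(ACCURACY, IMPROVEMENT)
    return list(zip(DATASETS, ACCURACY, IMPROVEMENT, baselines))


if __name__ == "__main__":
    for name, acc, gain, base in summary_rows():
        print(f"{name:7s} proposed={acc:6.2f}%  gain={gain:5.2f}  "
              f"implied prior best={base:6.2f}%")
```

Under this reading, the largest gap is on MUGD, while NUS-I leaves the least headroom over earlier systems.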


Data Availability

The datasets that support the findings of this study are available in: (1) MUGD: http://hdl.handle.net/10179/4514. (2) ISL: https://doi.org/10.1007/s13369-021-06456-z. (3) ArSL: https://doi.org/10.1016/j.dib.2019.103777. (4) NUS-I and II: https://doi.org/10.1007/s11263-012-0560-5.


Author information

Contributions

BK was involved in experimentation, investigation, analysis, and paper writing. RHL contributed to conceptualization, reviewing and editing, and supervision. RKK contributed to analysis and reviewing.

Corresponding author

Correspondence to Bhumika Karsh.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Karsh, B., Laskar, R.H. & Karsh, R.K. mIV3Net: modified inception V3 network for hand gesture recognition. Multimed Tools Appl 83, 10587–10613 (2024). https://doi.org/10.1007/s11042-023-15865-1


  • DOI: https://doi.org/10.1007/s11042-023-15865-1
