Abstract
Object tracking in videos is a critical task in computer vision, made challenging by its processing complexity and high accuracy requirements. Varying lighting conditions, partial or complete occlusion, shape changes, and the presence of multiple persons make object tracking particularly difficult. A new dataset, the LNMIIT Dynamic Hand Gesture Dataset-5 (numerals 0 to 9), has been prepared under various challenging conditions. An innovative Region of Interest (ROI) hand detection model is proposed, which uses motion and color information to identify hands automatically. Template matching combined with the Improved mKLT (Modified Kanade Lucas Tomasi) tracking algorithm is used to track the hand; this hybrid approach aims to enhance tracking performance under challenging conditions. In addition, a novel and robust CNN model named HG-CNN (Hand Gesture Convolution Neural Network) is proposed for hand gesture recognition. HG-CNN is accurate and time-efficient, ensuring rapid response times. It is also designed to be energy-efficient, making it a compact and resource-sparing solution for real-time applications. The proposed CNN model achieves a recognition accuracy of 99.83%, demonstrating its effectiveness in handling object recognition tasks. A comparative study with the established pre-trained models LeNet5, Inception V3, and VGG16 shows that the proposed system outperforms them in terms of accuracy, time efficiency, and response time.
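As a rough illustration of how motion and color cues can be combined to locate a hand ROI, the following minimal sketch marks pixels that both changed between two frames and satisfy a simple RGB skin heuristic, then returns their bounding box. It is not the authors' detection model: the function names (`is_skin`, `detect_roi`), the skin thresholds, and the motion threshold are all illustrative assumptions, and frames are represented as plain nested lists to keep the example self-contained.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Frame = List[List[Pixel]]      # frame[row][col]


def is_skin(p: Pixel) -> bool:
    """Simple RGB skin heuristic (assumed thresholds, not taken from the paper)."""
    r, g, b = p
    return (r > 95 and g > 40 and b > 20
            and max(p) - min(p) > 15
            and abs(r - g) > 15 and r > g and r > b)


def gray(p: Pixel) -> float:
    """Luma approximation used to compare pixel intensity between frames."""
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def detect_roi(prev: Frame, curr: Frame,
               motion_thresh: float = 25.0) -> Optional[Tuple[int, int, int, int]]:
    """Return the inclusive box (top, left, bottom, right) around pixels that
    moved between frames AND look skin-coloured, or None if there are none."""
    rows, cols = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            moved = abs(gray(c) - gray(p)) > motion_thresh  # motion cue
            if moved and is_skin(c):                        # color cue
                rows.append(y)
                cols.append(x)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Usage: a skin-coloured patch appears on a dark background.
BG, SKIN = (30, 30, 30), (200, 140, 120)
prev_frame = [[BG] * 6 for _ in range(5)]
curr_frame = [row[:] for row in prev_frame]
for y in range(1, 3):
    for x in range(2, 5):
        curr_frame[y][x] = SKIN
print(detect_roi(prev_frame, curr_frame))  # (1, 2, 2, 4)
```

Requiring both cues means a skin-coloured region that does not move (for example, a face held still) is ignored, which is one plausible reason to combine motion with color rather than rely on either alone.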
Data Availability
The author declares that the dataset will be made available on request. Please contact the corresponding author.
Acknowledgements
This work is supported by DST (Govt. of India) under the SEED Division [SP/YO/407/2018].
Author information
Contributions
Rabul Laskar, Joyeeta Singha, and Shweta Saboo conceived and designed the study. Manoj Kumar Sain conducted data gathering, performed model analyses, and wrote the article.
Ethics declarations
The author declares that the manuscript is prepared as per the journal's guidelines for authors.
Conflicts of Interest
The authors declare that no conflicts of interest exist.
About this article
Cite this article
Sain, M.K., Saboo, S., Singha, J. et al. Improved mKLT and low layered HG-CNN based dynamic gesture recognition hardware system. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18647-5
DOI: https://doi.org/10.1007/s11042-024-18647-5