Abstract
A sign language recognition system can help reduce the communication gap between deaf and hearing persons. However, Indian sign language (ISL) recognition systems are still at a developing stage. Most recent ISL recognition systems use convolutional neural networks (CNNs), in which the convolution operation slides a kernel over overlapping portions of the image. Because real-world images are highly correlated, these kernels may learn redundant data, and training neural networks on such redundant image data is challenging. To overcome this limitation, this paper proposes an ISL recognition system that uses the network deconvolution technique, which reduces both pixel-wise and channel-wise correlation in images. The proposed model is also augmented with a spatial transformer network to increase the invariance of the convolution operations to spatial transformations. In most cases, the proposed recognizer offers better accuracy than the other systems tested on two ISL datasets, VUCS_ISL_I and the newly created VUCS_ISL_II, and on standard datasets of other sign languages, namely American, Arabic, and Spanish sign language.
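The decorrelation idea behind network deconvolution can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: feature vectors (standing in for flattened image patches) are mean-centred and multiplied by an approximate inverse square root of their covariance matrix, obtained here with coupled Newton-Schulz iterations. The helper names (`covariance`, `inverse_sqrt`, `whiten`, `correlated_samples`), the regularisation constant `eps`, and the iteration count are hypothetical choices made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def covariance(rows, eps=0.0):
    """Return (covariance matrix + eps * I, mean-centred rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / n + (eps if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    return cov, centred


def inverse_sqrt(cov, iters=15):
    """Approximate cov^(-1/2) with coupled Newton-Schulz iterations."""
    d = len(cov)
    norm = sum(v * v for row in cov for v in row) ** 0.5  # Frobenius norm
    y = [[v / norm for v in row] for row in cov]  # Y_0 = cov / norm
    z = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # Z_0 = I
    for _ in range(iters):
        zy = matmul(z, y)
        # T = (3 I - Z Y) / 2
        t = [[0.5 * ((3.0 if i == j else 0.0) - zy[i][j]) for j in range(d)]
             for i in range(d)]
        y, z = matmul(y, t), matmul(t, z)
    # z approximates (cov / norm)^(-1/2); undo the normalisation.
    return [[v / norm ** 0.5 for v in row] for row in z]


def whiten(rows, eps=1e-5):
    """Centre the rows and remove linear correlation between their features."""
    cov, centred = covariance(rows, eps)
    return matmul(centred, inverse_sqrt(cov))


def correlated_samples(n=2000, seed=0):
    """Hypothetical toy data: three features with strong linear correlation."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        out.append([a, 0.8 * a + 0.6 * b, 0.5 * b + 0.866 * c])
    return out
```

Calling `whiten(correlated_samples())` returns features whose covariance matrix is close to the identity, so a layer applied afterwards sees inputs without the linear redundancy of the raw data. In a convolutional setting the rows would be patches extracted across pixels and channels rather than toy vectors.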
Data availability
Data will be made available on reasonable request.
Code availability
Custom code is available.
Acknowledgements
We would like to thank the Dept. of Computer Science, Vidyasagar University, Paschim Medinipur, Midnapore 721102, West Bengal, India, for providing the infrastructure to carry out our experiments.
Funding
Not applicable.
Author information
Contributions
AG contributed to implementation and drafting; UN contributed to conceptualization, investigation, methodology, analysis, and supervision; the other authors contributed to review and editing.
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest or competing interests.
Ethics approval
The authors affirm that the research presented in this paper was conducted following the principles of ethical and professional conduct.
Consent to participate
Not applicable.
Consent for publication
Not applicable. The authors used only publicly available data and provide the corresponding references.
About this article
Cite this article
Ghorai, A., Nandi, U., Changdar, C. et al. Indian sign language recognition system using network deconvolution and spatial transformer network. Neural Comput & Applic 35, 20889–20907 (2023). https://doi.org/10.1007/s00521-023-08860-y
DOI: https://doi.org/10.1007/s00521-023-08860-y