Dynamic Iranian Sign Language Recognition Using an Optimized Deep Neural Network: An Implementation via a Robotic-Based Architecture

Published in: International Journal of Social Robotics

Abstract

Sign language is a non-verbal communication tool used by the deaf. A robust sign language recognition framework is needed to develop Human–Robot Interaction (HRI) platforms that can interact with humans via sign language. Iranian Sign Language (ISL) comprises both static postures and dynamic gestures of the hand and fingers. In this paper, we present a robust framework that uses a Deep Neural Network (DNN) to recognize, in real time, dynamic ISL gestures captured by motion-capture gloves. To this end, a dataset of fifteen ISL classes was first collected as time series; this dataset was then virtually augmented and pre-processed using the “state-image” method to produce a unique collection of images, each image corresponding to a specific sequence of data representing one class. Next, a continuous genetic algorithm was used to find an optimal deep neural network with the minimum number of weights (trainable parameters) and the maximum overall accuracy. Finally, the dataset was fed to the DNN to train the model. The results show that the optimization process succeeded in finding a DNN structure well suited to this application, reaching 99.7% accuracy on the verification (test) data. After implementing the module in a robotic architecture, an HRI experiment was conducted to assess the system’s performance in real-time applications. Preliminary statistical analysis based on the standard UTAUT model with eight participants showed that the system can recognize ISL signs quickly and accurately during human–robot interaction. The proposed methodology can be applied to other sign languages, as no ISL-specific characteristics were used in the preprocessing or training stages.
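The pipeline summarized above (glove time series → “state image” → DNN classifier) can be illustrated with a minimal sketch. The exact “state-image” encoding is not reproduced on this page, so the function name, output size, and per-channel min-max normalization below are illustrative assumptions, not the authors’ method: the idea is simply that a variable-length multichannel sequence is mapped to one fixed-size grayscale image per gesture.

```python
import numpy as np

def state_image(sequence, out_size=(64, 64)):
    """Map a (timesteps, channels) glove recording to a fixed-size
    grayscale 'state image' (illustrative sketch, not the paper's exact
    encoding): normalize each sensor channel to [0, 1], then resample the
    time and channel axes to the target image size by nearest neighbour."""
    seq = np.asarray(sequence, dtype=float)
    # Per-channel min-max normalization; constant channels map to 0.
    lo, hi = seq.min(axis=0), seq.max(axis=0)
    norm = (seq - lo) / np.where(hi - lo == 0, 1.0, hi - lo)
    # Nearest-neighbour resampling of rows (time) and columns (channels).
    rows = np.linspace(0, norm.shape[0] - 1, out_size[0]).round().astype(int)
    cols = np.linspace(0, norm.shape[1] - 1, out_size[1]).round().astype(int)
    return norm[np.ix_(rows, cols)]

# Example: a 120-timestep recording from a hypothetical 10-sensor glove
# becomes a single 64x64 image that a CNN-style DNN can classify.
img = state_image(np.random.rand(120, 10))
print(img.shape)  # (64, 64)
```

Because every gesture becomes one fixed-size image regardless of its duration, a single feed-forward network can be trained on them directly, which is what makes the subsequent genetic-algorithm search over network structures tractable.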

Data availability

All data from this project (videos of the sessions, results of the questionnaires, scores of performances, etc.) are available in the archive of the Social & Cognitive Robotics Laboratory.

Code availability

All code is available in the archive of the Social & Cognitive Robotics Laboratory; readers who need the code may contact the corresponding author.



Acknowledgements

This research was supported by the “Iranian National Science Foundation (INSF)” (http://en.insf.org/). We also appreciate the “Dr. AliAkbar Siassi Memorial Grant Award” for their complementary support of the Social & Cognitive Robotics Laboratory.

Funding

Author Alireza Taheri has received research grants from the “Iranian National Science Foundation (INSF)” (Grant No. 98025100).

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Salar Basiri and Alireza Taheri. The first draft of the manuscript was written by Salar Basiri, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Alireza Taheri.

Ethics declarations

Conflict of interest

The authors Salar Basiri, Ali Meghdari, Mehrdad Boroushaki, and Minoo Alemi declare that they have no conflict of interest.

Ethics approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Ethical approval for the protocol of this study was provided by the Iran University of Medical Sciences (#IR.IUMS.REC.1395.95301469).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Basiri, S., Taheri, A., Meghdari, A.F. et al. Dynamic Iranian Sign Language Recognition Using an Optimized Deep Neural Network: An Implementation via a Robotic-Based Architecture. Int J of Soc Robotics 15, 599–619 (2023). https://doi.org/10.1007/s12369-021-00819-0

