Multiple Proposals for Continuous Arabic Sign Language Recognition

Abstract

The deaf community relies on sign language as its primary means of communication. For the millions of people around the world who suffer from hearing loss, interaction with hearing people is quite difficult. The main objective of sign language recognition (SLR) is the development of automatic systems that facilitate communication with the deaf community. Arabic SLR (ArSLR) in particular received little attention until recent years. This work presents a comprehensive comparison between two recognition techniques for continuous ArSLR: a modified k-Nearest Neighbor (KNN) classifier suited to sequential data, and Hidden Markov Models (HMMs) based on two different toolkits. Additionally, two new ArSL datasets composed of 40 Arabic sentences are collected using a Polhemus G4 motion tracker and a camera. An existing glove-based dataset is employed as well. The three datasets are made publicly available to the research community. The advantages and disadvantages of each data acquisition approach and classification technique are discussed. The experimental results show that the classification accuracy for sign sentences acquired using a motion tracker is very similar to the classification accuracy for sentences acquired using sensor gloves. The modified KNN solution is inferior to HMMs in terms of the computational time required for classification.

Acknowledgements

The authors gratefully acknowledge the American University of Sharjah for supporting this research through Grant FRG14-2-26.

Author information

Corresponding author

Correspondence to Mohamed Hassan.

About this article

Cite this article

Hassan, M., Assaleh, K. & Shanableh, T. Multiple Proposals for Continuous Arabic Sign Language Recognition. Sens Imaging 20, 4 (2019). https://doi.org/10.1007/s11220-019-0225-3

Keywords

  • Arabic sign language recognition
  • Pattern classification
  • Feature extraction
  • Motion detectors