Sensing and Imaging, 20:4

Multiple Proposals for Continuous Arabic Sign Language Recognition

  • Mohamed Hassan
  • Khaled Assaleh
  • Tamer Shanableh
Original Paper

Abstract

The deaf community relies on sign language as its primary means of communication. For the millions of people around the world who suffer from hearing loss, interaction with hearing people is quite difficult. The main objective of sign language recognition (SLR) research is to develop automatic SLR systems that facilitate communication with the deaf community. Arabic SLR (ArSLR) in particular did not receive much attention until recent years. This work presents a comprehensive comparison between two recognition techniques for continuous ArSLR: a modified k-nearest neighbor (KNN) classifier suited to sequential data, and hidden Markov models (HMMs) built with two different toolkits. In addition, two new ArSL datasets composed of 40 Arabic sentences are collected, one using a Polhemus G4 motion tracker and one using a camera. An existing glove-based dataset is employed as well. The three datasets are made publicly available to the research community. The advantages and disadvantages of each data acquisition approach and classification technique are discussed. The experimental results show that the classification accuracy for sign sentences acquired using a motion tracker is very similar to the classification accuracy for sentences acquired using sensor gloves. The modified KNN solution is inferior to HMMs in terms of the computational time required for classification.

Keywords

Arabic sign language recognition; Pattern classification; Feature extraction; Motion detectors

Notes

Acknowledgements

The authors gratefully acknowledge the American University of Sharjah for supporting this research through Grant FRG14-2-26.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Mechatronics Engineering Program, American University of Sharjah, Sharjah, UAE
  2. Department of Electrical Engineering, Ajman University, Ajman, UAE
  3. Department of Computer Science and Engineering, American University of Sharjah, Sharjah, UAE