Sign boundary and hand articulation feature recognition in Sign Language videos

Abstract

In this paper we present a recommendation system for (semi-)automatic annotation of sign language videos, which exploits deep learning techniques to recognize handshapes in continuous signing data. The two major tools in our approach are the keypoint output of OpenPose and the use of HamNoSys in the sign annotations of the training data. Before applying our method to signed phrases, we tested it on the recognition of hand shape, hand location and palm orientation in isolated signs, using two lexical datasets. The system was trained on the Danish Sign Language lexicon and was then applied to POLYTROPON, a lexicon of Greek Sign Language (GSL), on which it achieved satisfactory recognition results. Experiments with the POLYTROPON corpus of GSL phrases confirm that our approach reaches satisfactory accuracy rates. It can therefore serve as a recommendation system for the semi-automatic annotation of isolated signs and signed phrases in large SL video collections, also contributing to the development of further datasets for machine learning training.
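The abstract describes a pipeline in which OpenPose hand keypoints are fed to a learned classifier of hand articulation features. As a minimal illustration of that idea (not the authors' actual deep model), the sketch below normalizes the 21 OpenPose hand keypoints into a translation- and scale-invariant feature vector and labels it with a simple nearest-centroid lookup; the function names and the centroid-based classifier are illustrative assumptions.

```python
import numpy as np

def normalize_hand(keypoints):
    """Make 21 OpenPose-style hand keypoints translation- and scale-invariant.

    keypoints: (21, 2) array of (x, y) pixel coordinates; index 0 is the wrist.
    Returns a flat 42-dimensional feature vector.
    """
    pts = np.asarray(keypoints, dtype=float)
    pts = pts - pts[0]                       # translate so the wrist is the origin
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts = pts / scale                    # scale so the farthest joint lies at distance 1
    return pts.ravel()

def classify_handshape(feature, centroids):
    """Illustrative stand-in for a trained classifier: nearest centroid.

    centroids: dict mapping a handshape label (e.g. a HamNoSys handshape
    symbol) to a 42-dimensional reference feature vector.
    """
    return min(centroids, key=lambda label: np.linalg.norm(feature - centroids[label]))
```

The normalization step matters because raw keypoints depend on the signer's position and distance from the camera; a real system would replace the centroid lookup with a trained neural classifier over such features.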




Acknowledgement

This work was supported by the project "Computational Science and Technologies: Data, Content and Interaction" (MIS 5002437), implemented under the Action "Reinforcement of the Research and Innovation Infrastructure", funded by the Operational Programme "Competitiveness, Entrepreneurship and Innovation" (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund). We sincerely thank the Center for Tegnsprog for providing the Danish Sign Language dataset used in this work.

Author information

Corresponding author

Correspondence to Ioannis Koulierakis.


Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 2421 kb)


About this article

Cite this article

Koulierakis, I., Siolas, G., Efthimiou, E. et al. Sign boundary and hand articulation feature recognition in Sign Language videos. Machine Translation (2021). https://doi.org/10.1007/s10590-021-09271-3

Keywords

  • Sign language recognition
  • Deep learning
  • Continuous signing
  • Recommendation system
  • Annotated SL resources