A framework for continuous fingerspelling spotting for H.264/AVC compressed videos using spatio-temporal Markov random field

Multimedia Tools and Applications

Abstract

Continuous sign language recognition (SLR) systems suffer from a problem called movement epenthesis (ME): the intermediate connecting movement that occurs between two consecutive signs. This paper proposes a novel framework for spotting continuous fingerspelling sequences that extracts the motion information of signs directly from compressed video. The framework is based on motion vectors extracted from H.264/AVC compressed videos. A spatio-temporal Markov random field (ST-MRF) model is employed to classify the non-rigid motions of the fingers as sign or ME. The framework was tested on a number of sign language videos encoded with the H.264/AVC JM encoder, and the spotting accuracy was found to be around 75%.
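
The abstract does not give the model's details, so the following is only a minimal sketch of the general idea of labelling a block-level motion-vector field with a spatio-temporal MRF; it is not the authors' formulation. Everything in it is an assumption made for illustration: the class means in `MU`, the weights `beta_s` and `beta_t`, the use of iterated conditional modes (ICM) as the optimiser, the premise that ME blocks show larger motion than sign blocks, and the majority vote that turns block labels into a frame decision.

```python
from typing import List, Optional

Field = List[List[float]]  # per-block motion-vector magnitudes for one frame
Labels = List[List[int]]   # 0 = sign, 1 = movement epenthesis (ME)

# Hypothetical mean magnitudes for the two classes (not taken from the paper).
MU = (0.5, 4.0)


def local_energy(mag: Field, labels: Labels, prev: Optional[Labels],
                 i: int, j: int, lab: int,
                 beta_s: float, beta_t: float) -> float:
    """Energy of giving block (i, j) the label `lab`."""
    rows, cols = len(mag), len(mag[0])
    # Data term: distance of the observed magnitude from the class mean.
    energy = abs(mag[i][j] - MU[lab])
    # Spatial term: penalise disagreement with each 4-connected neighbour.
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols and labels[ni][nj] != lab:
            energy += beta_s
    # Temporal term: penalise disagreement with the same block in the previous frame.
    if prev is not None and prev[i][j] != lab:
        energy += beta_t
    return energy


def icm_label_frame(mag: Field, prev: Optional[Labels] = None,
                    beta_s: float = 0.8, beta_t: float = 0.8,
                    sweeps: int = 5) -> Labels:
    """Label one frame with ICM, starting from the data-only estimate."""
    labels = [[min((0, 1), key=lambda l: abs(m - MU[l])) for m in row]
              for row in mag]
    for _ in range(sweeps):
        changed = False
        for i in range(len(mag)):
            for j in range(len(mag[0])):
                best = min((0, 1), key=lambda l: local_energy(
                    mag, labels, prev, i, j, l, beta_s, beta_t))
                if best != labels[i][j]:
                    labels[i][j] = best
                    changed = True
        if not changed:  # a local minimum of the energy has been reached
            break
    return labels


def spot_frames(frames: List[Field]) -> List[str]:
    """Mark each frame as 'sign' or 'me' by majority vote over its block labels."""
    decisions: List[str] = []
    prev: Optional[Labels] = None
    for mag in frames:
        labels = icm_label_frame(mag, prev)
        n_me = sum(sum(row) for row in labels)
        n_blocks = len(labels) * len(labels[0])
        decisions.append("me" if n_me > n_blocks / 2 else "sign")
        prev = labels
    return decisions
```

In this toy setting, an isolated block with a moderately large magnitude inside an otherwise still frame is relabelled as sign by the spatial term, which is the kind of noise suppression an MRF prior is meant to provide. In practice the magnitudes would come from the motion vectors parsed out of the H.264/AVC bitstream rather than from hand-written lists.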

Author information

Corresponding author

Correspondence to Anjan Kumar Talukdar.

About this article

Cite this article

Talukdar, A.K., Bhuyan, M. A framework for continuous fingerspelling spotting for H.264/AVC compressed videos using spatio-temporal Markov random field. Multimed Tools Appl 80, 28329–28347 (2021). https://doi.org/10.1007/s11042-021-10910-3

  • DOI: https://doi.org/10.1007/s11042-021-10910-3
