A framework for continuous fingerspelling spotting for H.264/AVC compressed videos using spatio-temporal Markov random field

Talukdar, Anjan Kumar; Bhuyan, M.K.

doi:10.1007/s11042-021-10910-3

Published: 03 June 2021

Volume 80, pages 28329–28347 (2021)

Multimedia Tools and Applications

Abstract

Continuous sign language recognition (SLR) systems suffer from a problem called movement epenthesis (me): the intermediate connecting movement between two consecutive signs. This paper proposes a novel framework for spotting continuous fingerspelling sequences that extracts the motion information of signs directly from a compressed video. The framework is based on motion vectors extracted from H.264/AVC compressed videos. A Spatio-Temporal Markov Random Field (ST-MRF) based model is employed to model the non-rigid motions of the fingers as either sign or me. The proposed framework was tested on a number of sign language videos encoded with the H.264/AVC JM encoder, and the spotting accuracy was found to be around 75%.
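
To make the idea of labelling a motion-vector field with a spatio-temporal MRF more concrete, the following is a minimal sketch, not the authors' model. The abstract does not give the energy function, so everything here is an assumption made for illustration: the function name icm_st_mrf, the threshold-based data term, the Potts smoothness weights beta_s and beta_t, and the use of iterated conditional modes (ICM) as the optimiser. The two labels 0 and 1 are generic classes that could stand for sign and me.

```python
import numpy as np

def icm_st_mrf(mag, thresh=1.0, beta_s=0.5, beta_t=0.5, n_iter=5):
    """Binary labelling of a (T, H, W) motion-magnitude volume by ICM.

    Hypothetical energy (an assumption, not taken from the paper):
    data cost 1 if the label disagrees with (mag > thresh), plus Potts
    penalties beta_s / beta_t for each disagreeing spatial / temporal
    neighbour.
    """
    obs = (mag > thresh).astype(int)   # per-block evidence from magnitude
    labels = obs.copy()                # initialise labels from the evidence
    T, H, W = labels.shape
    spatial = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(n_iter):
        changed = False
        for t in range(T):
            for y in range(H):
                for x in range(W):
                    costs = []
                    for cand in (0, 1):
                        # data term: disagreement with the local evidence
                        cost = float(cand != obs[t, y, x])
                        # spatial smoothness over the 4-neighbourhood
                        for dy, dx in spatial:
                            yy, xx = y + dy, x + dx
                            if 0 <= yy < H and 0 <= xx < W:
                                cost += beta_s * (cand != labels[t, yy, xx])
                        # temporal smoothness with previous and next frame
                        for dt in (-1, 1):
                            tt = t + dt
                            if 0 <= tt < T:
                                cost += beta_t * (cand != labels[tt, y, x])
                        costs.append(cost)
                    best = int(np.argmin(costs))
                    if best != labels[t, y, x]:
                        labels[t, y, x] = best
                        changed = True
        if not changed:   # stop once a full sweep changes nothing
            break
    return labels
```

In this sketch, an isolated block whose evidence disagrees with all of its spatial and temporal neighbours is relabelled to match them, which is the smoothing behaviour an MRF prior is meant to provide. In practice the motion magnitudes would come from the motion vectors of the compressed bitstream; here they are simply an input array.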

Author information

Authors and Affiliations

Indian Institute of Technology Guwahati, Guwahati, 781039, India
Anjan Kumar Talukdar & M.K. Bhuyan

Corresponding author

Correspondence to Anjan Kumar Talukdar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Talukdar, A.K., Bhuyan, M. A framework for continuous fingerspelling spotting for H.264/AVC compressed videos using spatio-temporal Markov random field. Multimed Tools Appl 80, 28329–28347 (2021). https://doi.org/10.1007/s11042-021-10910-3

Received: 01 June 2020
Revised: 01 November 2020
Accepted: 01 April 2021
Published: 03 June 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11042-021-10910-3
