Hand Sign Recognition for Thai Finger Spelling: an Application of Convolution Neural Network

Nakjai, Pisit; Katanyukul, Tatpong

doi:10.1007/s11265-018-1375-6

Published: 24 May 2018

Volume 91, pages 131–146, (2019)

Journal of Signal Processing Systems

Abstract

Finger spelling is a necessary part of sign language, an important means of communication among people with hearing disabilities. It is used to spell out names, places, or signs that have not yet been defined. A sign recognition system aims to improve communication between the hearing majority and people with hearing disabilities. Our study investigates Thai Finger Spelling (TFS), its unique characteristics, the design of automatic TFS recognition, and approaches to handling a key potential issue in TFS recognition. We design automatic TFS recognition as a two-stage pipeline: (1) locating and extracting the signing hand in the image and (2) classifying the extracted image as a valid TFS sign. The signing hand is located and extracted based on a color scheme and on contour area computed using Green's theorem. Two approaches to sign image classification are examined: one based on a Convolutional Neural Network (CNN) and one based on Histogram of Oriented Gradients (HOG) features. Our experimental results show the viability of the proposed pipeline, which achieves a mean Average Precision (mAP) of 91.26. The proposed design outperforms the state of the art in automatic visual TFS recognition. In a practical sign recognition system, invalid TFS signs may appear during sign transitions or simply from unintentional hand postures. To address this, we propose a formulation called the confidence ratio, which is simple to compute and generally compatible with multi-class classifiers. The confidence ratio was found to be a promising mechanism for identifying invalid TFS signs. Our findings reveal challenging issues in TFS recognition, a practical design for TFS sign transcription, and the formulation and effectiveness of the confidence ratio.
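
As a rough illustration of two ideas named in the abstract, the sketch below computes the area of a closed contour with Green's theorem (in its discrete form, the shoelace formula) and picks the largest contour as a hand candidate. It also shows one possible reading of a confidence ratio. The abstract does not give the formula, so the definition used here (highest class score divided by the second highest) is an assumption for illustration only, as are all function names and the threshold in the usage note.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def contour_area(contour: Sequence[Point]) -> float:
    """Area enclosed by a closed polygonal contour.

    Green's theorem turns the area integral into a line integral along
    the boundary; for a polygon this reduces to the shoelace formula.
    """
    n = len(contour)
    if n < 3:
        return 0.0  # fewer than three vertices enclose no area
    twice_area = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the contour
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def largest_contour(contours: Sequence[Sequence[Point]]) -> Sequence[Point]:
    """Pick the contour with the largest enclosed area (hand candidate)."""
    return max(contours, key=contour_area)


def confidence_ratio(scores: Sequence[float]) -> float:
    """ASSUMED definition, not taken from the paper:
    highest class score divided by the second-highest class score."""
    if len(scores) < 2:
        raise ValueError("need at least two class scores")
    ranked = sorted(scores, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    return float("inf") if runner_up == 0 else top / runner_up
```

Under this assumed definition, a prediction could be flagged as an invalid sign when the ratio falls below a chosen threshold, for example `confidence_ratio(scores) < 2.0`; that threshold is hypothetical and would need tuning on held-out data.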

Notes

The Thai alphabet has forty-four official consonant letters, two of which are obsolete.


Author information

Authors and Affiliations

Department of Computer Engineering, Khon Kaen University, Khon Kaen, Thailand
Pisit Nakjai & Tatpong Katanyukul

Corresponding author

Correspondence to Pisit Nakjai.

About this article

Cite this article

Nakjai, P., Katanyukul, T. Hand Sign Recognition for Thai Finger Spelling: an Application of Convolution Neural Network. J Sign Process Syst 91, 131–146 (2019). https://doi.org/10.1007/s11265-018-1375-6

Received: 11 September 2017
Revised: 11 January 2018
Accepted: 26 April 2018
Published: 24 May 2018
Issue Date: February 2019
DOI: https://doi.org/10.1007/s11265-018-1375-6
