Sign language digits and alphabets recognition by capsule networks

Xiao, Hongwang; Yang, Yun; Yu, Ke; Tian, Jiao; Cai, Xinyi; Muhammad, Usman; Chen, Jinjun

doi:10.1007/s12652-021-02974-8

Original Research
Published: 23 February 2021

Journal of Ambient Intelligence and Humanized Computing, volume 13, pages 2131–2141 (2022)

Hongwang Xiao¹ (ORCID: orcid.org/0000-0001-7118-1038), Yun Yang¹, Ke Yu¹, Jiao Tian¹, Xinyi Cai¹, Usman Muhammad² & Jinjun Chen¹

Abstract

Communication barriers exist between deaf and hearing people. Sign language translation is a reasonable and effective way to break these barriers, and recognition of sign language symbols is an essential part of it. Sign language digits (0–9) and alphabetic letters (A–Z) are elementary but important symbols in the sign languages of different countries and regions. Capsule networks (CapsNet) are a promising alternative to convolutional neural networks (CNN) because they take into account the spatial relationships and orientations of the features of an entity. For sign language digit and alphabet recognition tasks, the proposed SLR-CapsNet architecture achieves state-of-the-art test accuracy: 99.52% with 100×100 RGB input and 99.94% with 32×32 RGB input on the Sign Language Digits Dataset, and 99.60% with 28×28 grayscale input on the Sign Language MNIST Dataset. The experimental results also show that CapsNet has higher generalization and expressiveness capacity on unseen data than CNN does. Another important finding is that SLR-CapsNet is robust to the number of routing iterations, i.e., its performance is not affected by varying the number of routing iterations.
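
To make the term "routing iterations" concrete, the following is a minimal NumPy sketch of routing-by-agreement as described by Sabour, Frosst and Hinton in "Dynamic routing between capsules" (2017). It is not the SLR-CapsNet architecture, whose layers are not given in the abstract. The function names, the capsule counts (32 input capsules, 10 output capsules, 16 dimensions) and the random prediction vectors are assumptions chosen only for illustration.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Scale each vector to a length in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def dynamic_routing(u_hat, num_iterations=3):
    """Route prediction vectors to output capsules.

    u_hat has shape (num_in, num_out, dim_out): the prediction each
    input capsule makes for each output capsule.
    Returns the output capsule vectors, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iterations):
        c = softmax(b, axis=1)                    # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)    # weighted sum per output capsule
        v = squash(s)                             # output capsule vectors
        b = b + (u_hat * v[None]).sum(axis=-1)    # raise logits where predictions agree
    return v


# Hypothetical input: 32 input capsules voting for 10 classes (e.g. digits 0-9).
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(32, 10, 16))
for r in (1, 2, 3, 5):
    v = dynamic_routing(u_hat, num_iterations=r)
    lengths = np.linalg.norm(v, axis=-1)  # vector length acts as class score
    print(r, int(lengths.argmax()))
```

Under this scheme, the length of each output vector serves as the class score, so comparing the predicted class across values of `num_iterations` is one simple way to probe the kind of robustness the abstract reports.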

Acknowledgements

This paper is partly supported by Australian Research Council (ARC) projects DP190101893, DP170100136 and LP180100758.

Author information

Authors and Affiliations

Swinburne University of Technology, Melbourne, Australia
Hongwang Xiao, Yun Yang, Ke Yu, Jiao Tian, Xinyi Cai & Jinjun Chen
Federation University, Ballarat, Australia
Usman Muhammad

Corresponding author

Correspondence to Hongwang Xiao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiao, H., Yang, Y., Yu, K. et al. Sign language digits and alphabets recognition by capsule networks. J Ambient Intell Human Comput 13, 2131–2141 (2022). https://doi.org/10.1007/s12652-021-02974-8

Received: 13 September 2020
Accepted: 11 February 2021
Published: 23 February 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s12652-021-02974-8
