Cornerstone network with feature extractor: a metric-based few-shot model for Chinese natural sign language

Applied Intelligence

Abstract

Standard Chinese natural sign language (CNSL) contains over 8,000 words. We therefore divide the CNSL recognition task into multiple subtasks, because few-shot learning on each subtask keeps acquisition cost minimal and training time short. However, existing few-shot learning methods do not take into account the impact of ill-conditioned support samples, so we propose a new metric-based model, the Cornerstone Network (CN), to complete the subtasks. CN is composed mainly of an optional feature extractor, an embedding network, and a cornerstone generator, which is designed as a semi-supervised clusterer. Compared with other metric-based few-shot models, CN without the feature extractor improves 5-shot accuracy on Omniglot and miniImageNet. To verify the feasibility of our model for CNSL recognition, we expanded the Chinese Natural Sign Language database, which integrates surface electromyography and inertial signals, from CNSL-80 to CNSL-139. The 5-shot accuracy on CNSL-139 increases from 65.25% to 68.83% compared with the state-of-the-art model. After connecting the 1-D convolutional feature extractor, with secondary training based on the Siamese Network idea, the accuracy increases by 10.38%. During the online test, the feature vector norms are used for selective matching (SM). Although the accuracy drops, it is still at least 5% higher than without the feature extractor. Experimental results confirm the effectiveness of our model on 2-D images and 1-D time-series signals, and the improvement that SM brings to real-time recognition.
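As a rough illustration of what metric-based 5-shot classification means in general, the sketch below implements a generic nearest-centroid classifier in the style of prototypical networks. It is not the Cornerstone Network: the cornerstone generator, embedding network, and feature extractor described in the abstract are not reproduced here. The synthetic vectors stand in for the output of an embedding network, and all function names, the noise scale, and the class centers are assumptions made for the example.

```python
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def squared_distance(a, b):
    """Squared Euclidean distance, used here as the metric (an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_centroids(support):
    """support maps each class label to its K support embeddings."""
    return {label: centroid(vectors) for label, vectors in support.items()}


def classify(query, centroids):
    """Return the label whose centroid is closest to the query embedding."""
    return min(centroids, key=lambda label: squared_distance(query, centroids[label]))


def noisy(center, rng, scale=0.5):
    """Hypothetical stand-in for an embedding: class center plus Gaussian noise."""
    return [c + rng.gauss(0.0, scale) for c in center]


# Toy 3-way 5-shot episode with well-separated synthetic classes.
rng = random.Random(0)
class_centers = {0: [10.0, 0.0, 0.0], 1: [0.0, 10.0, 0.0], 2: [0.0, 0.0, 10.0]}
k_shot = 5
support = {
    label: [noisy(center, rng) for _ in range(k_shot)]
    for label, center in class_centers.items()
}
query_labels = [0, 1, 2, 2, 1, 0]
queries = [noisy(class_centers[label], rng) for label in query_labels]

centroids = build_centroids(support)
predictions = [classify(q, centroids) for q in queries]
accuracy = sum(p == t for p, t in zip(predictions, query_labels)) / len(query_labels)
print(f"5-shot accuracy on the toy episode: {accuracy:.2f}")
```

A plain mean over the support set is sensitive to a single badly placed support sample, which is the kind of ill-conditioned case the abstract says existing methods overlook.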

Data Availability

The authors are not making the data available at present.

Funding

This work was supported in part by the Fundamental Research Funds for the Central Universities of China under Grants N172608005 and N182612002, and by the Liaoning Provincial Natural Science Foundation of China under Grant 20180520007.

Author information

Corresponding author

Correspondence to Fei Wang.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Consent for Publication

The authors declare that they consent to publication.

Additional information

Code Availability

The authors are not making the code available at present.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.

Consent to Participate

Informed consent was obtained from all individual participants included in the study.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, F., Li, C., Zeng, Z. et al. Cornerstone network with feature extractor: a metric-based few-shot model for Chinese natural sign language. Appl Intell 51, 7139–7150 (2021). https://doi.org/10.1007/s10489-020-02170-9

  • DOI: https://doi.org/10.1007/s10489-020-02170-9
