Abstract
Standard Chinese natural sign language (CNSL) contains over 8,000 words. We consider dividing the task of CNSL recognition into multiple subtasks. Few-shot learning on the subtasks keeps the data acquisition cost low and the training time short. However, existing few-shot learning methods do not take into account the impact of ill-conditioned support samples, so we propose a new metric-based model, Cornerstone Network (CN), to complete the subtasks. CN mainly consists of an optional feature extractor, an embedding network, and a cornerstone generator. The cornerstone generator is designed as a semi-supervised clusterer. Compared with other metric-based few-shot models, CN without the feature extractor improves 5-shot accuracy on Omniglot and miniImageNet. To verify the feasibility of our model on the task of CNSL recognition, we expanded the Chinese Natural Sign Language database, which integrates surface electromyography and inertial signals, from CNSL-80 to CNSL-139. The 5-shot accuracy on CNSL-139 increases from 65.25% to 68.83% compared with the state-of-the-art model. After CN is connected to a 1-D convolutional feature extractor and trained a second time following the idea of the Siamese Network, the accuracy increases by 10.38%. During the online test, the feature vector norms are used for selective matching (SM). Although the accuracy drops, it is still at least 5% higher than that without the feature extractor. Experimental results confirm the effectiveness of our model on 2-D images and 1-D time-series signals, and the improvement of real-time recognition by SM.
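The sensitivity of metric-based few-shot models to an ill-conditioned support sample can be illustrated with a small, generic sketch. The code below is not the Cornerstone Network: it assumes class centers are plain means of the support embeddings and uses one soft k-means step over unlabeled embeddings as a stand-in for a semi-supervised clusterer. The function names (`class_centers`, `classify`), the 2-D toy embeddings, and the refinement rule are all illustrative assumptions.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def soft_assign(point, centers):
    """Softmax over negative squared distances to each center."""
    logits = [-sq_dist(point, c) for c in centers]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_centers(support, unlabeled=(), refine_steps=1):
    """support maps label -> list of embeddings (hypothetical format).

    Starts from per-class support means, then applies soft k-means
    updates in which support points keep weight 1 and unlabeled points
    contribute with their soft assignment weights.
    """
    labels = sorted(support)
    centers = [[sum(col) / len(support[l]) for col in zip(*support[l])]
               for l in labels]
    for _ in range(refine_steps):
        if not unlabeled:
            break
        weights = [soft_assign(u, centers) for u in unlabeled]
        refined = []
        for k, label in enumerate(labels):
            pts = support[label]
            mass = len(pts) + sum(w[k] for w in weights)
            refined.append([
                (sum(p[d] for p in pts)
                 + sum(w[k] * u[d] for w, u in zip(weights, unlabeled))) / mass
                for d in range(len(pts[0]))
            ])
        centers = refined
    return labels, centers

def classify(query, labels, centers):
    """Assign the query to the label of the nearest center."""
    dists = [sq_dist(query, c) for c in centers]
    return labels[dists.index(min(dists))]

# Toy 2-way 3-shot episode; class "a" contains one outlying support sample.
support = {"a": [[0.0, 0.0], [0.2, 0.0], [-3.0, 0.0]],
           "b": [[4.0, 0.0], [4.2, 0.0], [3.8, 0.0]]}
unlabeled = [[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [4.1, 0.0], [3.9, 0.0]]
query = [1.6, 0.0]

labels, plain = class_centers(support, unlabeled, refine_steps=0)
_, refined = class_centers(support, unlabeled, refine_steps=1)
print(classify(query, labels, plain), classify(query, labels, refined))
```

In this toy episode the outlier drags the plain mean of class "a" away from the rest of its samples, and the refinement step pulls the center back toward the unlabeled points that cluster around them. How the actual cornerstone generator forms its centers is not specified in the abstract.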
Data Availability
The authors are not making the data available at this time.
Funding
This work was supported in part by the Fundamental Research Funds for the Central Universities of China under Grants N172608005 and N182612002, and by the Liaoning Provincial Natural Science Foundation of China under Grant 20180520007.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Consent for Publication
The authors declare that they consent to publication.
Additional information
Code Availability
The authors are not making the code available at this time.
Ethical Approval
This article does not contain any studies with human participants performed by any of the authors.
Consent to Participate
Informed consent was obtained from all individual participants included in the study.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, F., Li, C., Zeng, Z. et al. Cornerstone network with feature extractor: a metric-based few-shot model for chinese natural sign language. Appl Intell 51, 7139–7150 (2021). https://doi.org/10.1007/s10489-020-02170-9
DOI: https://doi.org/10.1007/s10489-020-02170-9