Normal Template Mapping: An Association-Inspired Handwritten Character Recognition Model

Miao, Jun; Liu, Peng; Chen, Chen; Qiao, Yuanhua

doi:10.1007/s12559-024-10270-8

Published: 12 March 2024

Cognitive Computation

Jun Miao (ORCID: orcid.org/0000-0003-0344-7871)¹, Peng Liu¹, Chen Chen² & Yuanhua Qiao³

Abstract

When identifying objects, people usually recall templates from memory to guide visual attention and determine the category of an object. The first character images that children learn are usually normal patterns, whereas the corresponding handwritten patterns vary widely. To learn these highly variable, deformed images, current deep models require millions of parameters for classification tasks that seem far simpler to children, who learn to recognize new characters by associating them with the normal patterns they were first taught. From the perspective of human perception, when people see a new object, they first think of a similar template image in their memory. This mapping process makes it easier for humans to learn new objects. Inspired by this cognitive association mechanism, this study develops a handwritten character recognition model built on a proposed normal template mapping neural network. The network uses an encoder-decoder architecture to transform handwritten character images of a given class into normalized characters that resemble a printed template character image representing that class. A simple shallow classifier then recognizes these normalized images, which are easier to classify. The experimental results show that the proposed model achieves comparable or higher precision than current representative deep models with a much lower parameter count. By removing the individual styles of handwritten character images and mapping them to patterns similar to normal template images, the model greatly reduces the classification difficulty, because the classifier only has to classify known standard character images.
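The two-stage idea described in the abstract (map each handwritten image toward the template of its class, then classify the normalized result) can be illustrated with a toy sketch. This is not the authors' network: the encoder-decoder is replaced here by a single linear map trained with a pixel-wise mean squared error, the shallow classifier by a nearest-template rule, and the 2x2 images, `TEMPLATES`, `SAMPLES`, and all function names are hypothetical values chosen only for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical 2x2 "printed" templates, flattened row by row (assumption).
TEMPLATES: Dict[int, Vector] = {
    0: [1.0, 0.0, 0.0, 1.0],
    1: [0.0, 1.0, 1.0, 0.0],
}

# Hypothetical "handwritten" samples: distorted versions of the templates.
SAMPLES: List[Tuple[Vector, int]] = [
    ([0.9, 0.2, 0.1, 0.7], 0),
    ([0.6, 0.1, 0.3, 0.9], 0),
    ([0.2, 0.8, 0.9, 0.1], 1),
    ([0.3, 0.7, 0.6, 0.0], 1),
]


def apply_map(weights: Matrix, image: Vector) -> Vector:
    """Stand-in for the encoder-decoder: a single linear map y = W x."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def mse(a: Vector, b: Vector) -> float:
    """Pixel-wise mean squared error between two flattened images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mapping_loss(weights: Matrix) -> float:
    """Mean MSE between each mapped sample and the template of its class."""
    return sum(
        mse(apply_map(weights, img), TEMPLATES[label]) for img, label in SAMPLES
    ) / len(SAMPLES)


def train_step(weights: Matrix, lr: float) -> Matrix:
    """One full-batch gradient-descent step on mapping_loss."""
    n_pix = len(weights)
    scale = 2.0 / (len(SAMPLES) * n_pix)
    grad = [[0.0] * n_pix for _ in range(n_pix)]
    for img, label in SAMPLES:
        out = apply_map(weights, img)
        target = TEMPLATES[label]
        for i in range(n_pix):
            err = out[i] - target[i]
            for j in range(n_pix):
                grad[i][j] += scale * err * img[j]
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grad)
    ]


def classify(image: Vector) -> int:
    """Shallow stand-in classifier: label of the nearest template."""
    return min(TEMPLATES, key=lambda label: mse(image, TEMPLATES[label]))


# Train the stand-in mapping from a zero initialization.
weights: Matrix = [[0.0] * 4 for _ in range(4)]
initial_loss = mapping_loss(weights)
for _ in range(200):
    weights = train_step(weights, lr=0.5)
final_loss = mapping_loss(weights)

# Stage 2: classify the normalized (mapped) images, not the raw ones.
predictions = [classify(apply_map(weights, img)) for img, _ in SAMPLES]
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}; predictions: {predictions}")
```

The point of the sketch is the training target: every sample of a class is pushed toward the same template, so the classifier downstream only has to separate a small set of standard patterns. A real implementation would presumably use a convolutional encoder-decoder and a learned shallow classifier, as the abstract indicates.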

Data Availability

The MNIST and EMNIST datasets that support the findings of this study are publicly available on the web.

Funding

This research is partially sponsored by the Beijing Natural Science Foundation (No. 4202025); the Tianjin Anjian IoT Technology Enterprise Key Laboratory Research Project (No. VTJ-OT20230209-2); the Beijing VanJee Technology Co., Ltd-Beijing Municipal Science and Technology Project (No. Z201100003920003); and the Guizhou Provincial Sci-Tech Project (No. zk[2022] general 012).

Author information

Authors and Affiliations

School of Computer Science, Beijing Information Science and Technology University, Beijing, China
Jun Miao & Peng Liu
Pathological Information Engineering Technology Center, Jinan Supercomputing Technology Research Institute, Jinan, China
Chen Chen
College of Applied Sciences, Beijing University of Technology, Beijing, China
Yuanhua Qiao

Corresponding author

Correspondence to Jun Miao.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

The authors have no conflicts of interest to declare relevant to this article’s content.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Miao, J., Liu, P., Chen, C. et al. Normal Template Mapping: An Association-Inspired Handwritten Character Recognition Model. Cogn Comput (2024). https://doi.org/10.1007/s12559-024-10270-8

Received: 31 July 2023
Accepted: 03 March 2024
Published: 12 March 2024
DOI: https://doi.org/10.1007/s12559-024-10270-8

Keywords
