Normal Template Mapping: An Association-Inspired Handwritten Character Recognition Model

Cognitive Computation

Abstract

When identifying objects, people usually recall memory templates to guide visual attention and determine an object's category. The first character images that children learn are usually normal patterns, whereas the corresponding handwritten patterns vary widely. To learn these deformed images with large variance, current deep models need millions of parameters for classification tasks that seem far easier to children, who learn to recognize new characters by associating them with the normal patterns they were initially taught. From the perspective of human perception, when people see a new object, they first think of a similar template image in memory, and this mapping process makes new objects easier to learn. Inspired by this cognitive association mechanism, this study develops a handwritten character recognition model built on a proposed normal template mapping neural network. The network uses an encoder-decoder architecture to transform handwritten character images of one class into normalized characters that resemble a given printed template image representing that class. A simple shallow classifier then recognizes these normalized images, which are easier to classify. The experimental results show that the proposed model achieves comparable or higher precision than current representative deep models with far fewer parameters. By removing the individual styles of handwritten character images and mapping them to patterns similar to normal template images, the model greatly reduces classification difficulty, because the classifier only needs to classify known standard character images.
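The two-stage idea described in the abstract (map each handwritten image toward the printed template of its class, then classify the normalized result with a shallow classifier) can be illustrated with a toy sketch. This is not the authors' network. Everything here is an assumption made for illustration: the 3x3 templates, the noise model standing in for handwriting variation, the linear 9-4-9 encoder-decoder, the mean squared error objective, and the nearest-template rule standing in for the shallow classifier.

```python
import random

random.seed(0)

# Hypothetical 3x3 "printed" templates, flattened row by row (illustrative only).
TEMPLATES = {
    0: [1, 1, 1, 1, 0, 1, 1, 1, 1],  # ring, standing in for "0"
    1: [0, 1, 0, 0, 1, 0, 0, 1, 0],  # vertical bar, standing in for "1"
}
N_PIX, N_HID = 9, 4


def make_sample(label, noise=0.3):
    """Simulate a handwritten variant: the class template plus pixel noise."""
    return [p + random.uniform(-noise, noise) for p in TEMPLATES[label]]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def mapper(W_enc, W_dec, x):
    """Linear encoder (9 -> 4) followed by a linear decoder (4 -> 9)."""
    h = matvec(W_enc, x)
    return h, matvec(W_dec, h)


def mean_loss(W_enc, W_dec, data):
    """Mean squared error between mapped images and their class templates."""
    total = 0.0
    for x, label in data:
        _, y = mapper(W_enc, W_dec, x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, TEMPLATES[label])) / N_PIX
    return total / len(data)


def init_weights(scale=0.3):
    W_enc = [[random.uniform(-scale, scale) for _ in range(N_PIX)] for _ in range(N_HID)]
    W_dec = [[random.uniform(-scale, scale) for _ in range(N_HID)] for _ in range(N_PIX)]
    return W_enc, W_dec


def train(W_enc, W_dec, data, epochs=300, lr=0.05):
    """Full-batch gradient descent on the template-mapping loss (updates in place)."""
    for _ in range(epochs):
        g_enc = [[0.0] * N_PIX for _ in range(N_HID)]
        g_dec = [[0.0] * N_HID for _ in range(N_PIX)]
        for x, label in data:
            h, y = mapper(W_enc, W_dec, x)
            # Gradient of the mean loss with respect to the decoder output.
            dy = [2 * (yi - ti) / (N_PIX * len(data))
                  for yi, ti in zip(y, TEMPLATES[label])]
            # Backpropagate through the decoder to the hidden code.
            dh = [sum(dy[i] * W_dec[i][j] for i in range(N_PIX)) for j in range(N_HID)]
            for i in range(N_PIX):
                for j in range(N_HID):
                    g_dec[i][j] += dy[i] * h[j]
            for j in range(N_HID):
                for k in range(N_PIX):
                    g_enc[j][k] += dh[j] * x[k]
        for i in range(N_PIX):
            for j in range(N_HID):
                W_dec[i][j] -= lr * g_dec[i][j]
        for j in range(N_HID):
            for k in range(N_PIX):
                W_enc[j][k] -= lr * g_enc[j][k]


def classify(y):
    """Stand-in shallow classifier: pick the template closest to the mapped image."""
    return min(TEMPLATES,
               key=lambda c: sum((yi - ti) ** 2 for yi, ti in zip(y, TEMPLATES[c])))


# Each training pair is (handwritten-like image, class label); the regression
# target is the printed template of that class, not the input itself.
data = [(make_sample(c), c) for c in (0, 1) for _ in range(20)]
W_enc, W_dec = init_weights()
initial_loss = mean_loss(W_enc, W_dec, data)
train(W_enc, W_dec, data)
final_loss = mean_loss(W_enc, W_dec, data)

held_out = [(make_sample(c), c) for c in (0, 1) for _ in range(10)]
accuracy = sum(classify(mapper(W_enc, W_dec, x)[1]) == c for x, c in held_out) / len(held_out)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, held-out accuracy {accuracy:.2f}")
```

The point of the sketch is the training target: unlike an autoencoder, which reconstructs its own input, the mapper is trained to output the class template, so the downstream classifier only ever has to separate near-standard patterns.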

Data Availability

The MNIST and EMNIST datasets that support the findings of this study are publicly available on the web.

Funding

This research is partially sponsored by the Beijing Natural Science Foundation (No. 4202025); the Tianjin Anjian IoT Technology Enterprise Key Laboratory Research Project (No. VTJ-OT20230209-2); the Beijing VanJee Technology Co., Ltd-Beijing Municipal Science and Technology Project (No. Z201100003920003); and the Guizhou Provincial Sci-Tech Project (No. zk[2022] general 012).

Author information

Corresponding author

Correspondence to Jun Miao.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

The authors have no conflicts of interest to declare relevant to this article’s content.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Miao, J., Liu, P., Chen, C. et al. Normal Template Mapping: An Association-Inspired Handwritten Character Recognition Model. Cogn Comput (2024). https://doi.org/10.1007/s12559-024-10270-8

  • DOI: https://doi.org/10.1007/s12559-024-10270-8
