
An end-to-end network for irregular printed Mongolian recognition

  • Original Paper
  • Published:
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Mongolian is a language spoken in Inner Mongolia, China. During recognition, the shooting angle and other factors can deform the image and text, which makes recognition difficult. This paper proposes a triplet attention Mogrifier network (TAMN) for printed Mongolian text recognition. The network uses a spatial transformer network to rectify deformed Mongolian images, then extracts features from the rectified images with gated recurrent convolution layers (GRCL) combined with a triplet attention module. A Mogrifier long short-term memory (LSTM) network captures the contextual sequence information in the features, and an attention-based LSTM decoder produces the final prediction. Experimental results show that the spatial transformer network can effectively handle deformed Mongolian images, reaching a recognition accuracy of 90.30%. Compared with current mainstream text recognition networks, the proposed network achieves good performance on Mongolian text recognition. The dataset is publicly available at https://github.com/ShaoDonCui/Mongolian-recognition.
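The Mogrifier LSTM named in the abstract differs from a standard LSTM only in a preprocessing step: before the cell runs, the input and the previous hidden state repeatedly gate each other. Below is a minimal NumPy sketch of that gating step, following the general Mogrifier formulation; the matrices `Q` and `R`, the dimension, and the round count are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h, Q, R, rounds=5):
    """Mogrifier gating: input x and previous hidden state h mutually
    gate each other for a fixed number of rounds before the LSTM cell.
    Odd rounds rescale x using h; even rounds rescale h using x."""
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            x = 2.0 * sigmoid(Q @ h) * x   # modulate input by hidden state
        else:
            h = 2.0 * sigmoid(R @ x) * h   # modulate hidden state by input
    return x, h

# Toy example with illustrative sizes and random projections.
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)        # current input vector
h = rng.standard_normal(d)        # previous hidden state
Q = rng.standard_normal((d, d))   # gating projection h -> x
R = rng.standard_normal((d, d))   # gating projection x -> h
x_mog, h_mog = mogrify(x, h, Q, R)
```

The modulated pair `(x_mog, h_mog)` would then be fed into an ordinary LSTM cell; with `rounds=0` the scheme reduces to a plain LSTM.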


Notes

  1. China Mongolian News Network—http://www.mgyxw.net.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61966027 and 61966028), the Inner Mongolia Autonomous Region Science and Technology Plan (Grant Nos. 2021GG0329 and 2021GG0140), and the National Natural Science Foundation of Inner Mongolia (Grant No. 2021MS06028). The authors wish to thank all the editors and anonymous reviewers for their constructive advice.

Author information


Corresponding author

Correspondence to YiLa Su.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cui, S., Su, Y., Qing dao er ji, R. et al. An end-to-end network for irregular printed Mongolian recognition. IJDAR 25, 41–50 (2022). https://doi.org/10.1007/s10032-021-00388-y

