Retrieval-based language model adaptation for handwritten Chinese text recognition

  • Original Paper
  • International Journal on Document Analysis and Recognition (IJDAR)

Abstract

In handwritten text recognition, computers fall far short of humans in linguistic context knowledge, especially domain-matched knowledge. In this paper, we present a novel retrieval-based method to obtain an adaptive language model for offline recognition of unconstrained handwritten Chinese texts. The content of the handwritten texts to be recognized varies and is usually unknown a priori, so we adopt a two-pass recognition strategy. In the first pass, we use a common language model to obtain initial recognition results, which are then used to retrieve related content from the Internet. For content retrieval, we evaluate different types of semantic representation derived from BERT output as well as the traditional TF–IDF representation. We then dynamically generate an adaptive language model from the retrieved content, combine it with the common language model, and apply the combined model in the second-pass recognition. We evaluate the proposed method on two benchmark unconstrained handwriting datasets, namely CASIA-HWDB and ICDAR-2013. Experimental results show that the proposed retrieval-based language model adaptation improves recognition performance, even though only a reduced amount of Internet content is employed.
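
The two-pass idea can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a small in-memory list stands in for Internet content, character bigrams stand in for word segmentation, the language models are add-alpha smoothed character unigram models, and the two models are combined by linear interpolation with a hypothetical weight `lam`. The abstract only says the models are "combined", so the interpolation is an illustrative choice. All function names are hypothetical.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Character bigrams; a simple stand-in for word segmentation (assumption)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (dicts) for docs, plus the IDF table."""
    counts = [Counter(char_bigrams(d)) for d in docs]
    n = len(docs)
    df = Counter(term for c in counts for term in c)
    # Smoothed IDF so that no term receives a zero weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in c.items()} for c in counts]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def retrieve(query, corpus, top_k=1):
    """Rank corpus documents by TF-IDF cosine similarity to the query."""
    vectors, idf = tfidf_vectors(corpus)
    q_counts = Counter(char_bigrams(query))
    # Bigrams unseen in the corpus (e.g. recognition errors) are ignored.
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    order = sorted(range(len(corpus)),
                   key=lambda i: cosine(q_vec, vectors[i]),
                   reverse=True)
    return [corpus[i] for i in order[:top_k]]


def unigram_lm(texts, vocab, alpha=1.0):
    """Add-alpha smoothed character unigram model over a fixed vocabulary."""
    counts = Counter(ch for t in texts for ch in t)
    total = sum(counts[ch] for ch in vocab) + alpha * len(vocab)
    return {ch: (counts[ch] + alpha) / total for ch in vocab}


def interpolate(p_common, p_adapt, lam=0.5):
    """Linear interpolation of two models (an assumed combination rule)."""
    return {ch: lam * p_adapt[ch] + (1 - lam) * p_common[ch] for ch in p_common}


# Toy data: the first-pass output contains one misrecognized character.
corpus = ["今天股市大幅上涨", "今天天气晴朗适合出游", "球队昨晚赢得比赛"]
first_pass = "今天天汽晴朗"

related = retrieve(first_pass, corpus, top_k=1)
vocab = set("".join(corpus) + first_pass)
p_common = unigram_lm(corpus, vocab)    # general-purpose model
p_adapt = unigram_lm(related, vocab)    # model built from retrieved content
p_mixed = interpolate(p_common, p_adapt, lam=0.5)
```

In this toy setting the retrieved document shares the topic of the first-pass output, so the interpolated model assigns the correct character 气 a higher probability than the common model alone does, which is the effect a second recognition pass would exploit.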

Notes

  1. Accordingly, over-segmentation is also called explicit segmentation.

  2. https://pypi.org/project/jieba/.

  3. https://github.com/google-research/bert.

  4. http://www.sogou.com/labs/resource/list_news.php.

Acknowledgements

This work was funded by the National Natural Science Foundation of China under grants no. 61876154 and no. 61876155, and by the Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under grant no. BE2020006-4.

Author information

Corresponding author

Correspondence to Qiufeng Wang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

About this article

Cite this article

Hu, S., Wang, Q., Huang, K. et al. Retrieval-based language model adaptation for handwritten Chinese text recognition. IJDAR 26, 109–119 (2023). https://doi.org/10.1007/s10032-022-00419-2

  • DOI: https://doi.org/10.1007/s10032-022-00419-2