Abstract
In handwritten text recognition, computers have far less linguistic context knowledge than humans, especially domain-matched knowledge. In this paper, we present a novel retrieval-based method to obtain an adaptive language model for offline recognition of unconstrained handwritten Chinese texts. The content of the handwritten texts to be recognized varies and is usually unknown a priori. We therefore adopt a two-pass recognition strategy. In the first pass, we use a common language model to obtain initial recognition results, which are then used to retrieve related content from the Internet. For content retrieval, we evaluate different types of semantic representation derived from BERT output as well as the traditional TF–IDF representation. We then dynamically generate an adaptive language model from the retrieved content, which is combined with the common language model and applied in the second-pass recognition. We evaluate the proposed method on two benchmark unconstrained handwriting datasets, namely CASIA-HWDB and ICDAR-2013. Experimental results show that the proposed retrieval-based language model adaptation improves recognition performance, even though only a reduced amount of Internet content is employed.
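The two-pass idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes character-level tokens, a toy two-document corpus standing in for retrieved Internet content, unigram models in place of the real language models, and linear interpolation with a hypothetical weight `lam` as the way to combine the adaptive and common models. All function names are illustrative.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (a dict) for each tokenized document."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so terms that occur in every document keep a nonzero weight.
    idf = {tok: math.log((1 + n) / (1 + cnt)) + 1.0 for tok, cnt in df.items()}
    return [{tok: tf * idf[tok] for tok, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, corpus, top_k=1):
    """Return indices of the top_k corpus documents most similar to the query."""
    vecs = tfidf_vectors(corpus + [query])
    q = vecs[-1]
    ranked = sorted(range(len(corpus)), key=lambda i: cosine(q, vecs[i]), reverse=True)
    return ranked[:top_k]


def unigram_lm(docs):
    """Maximum-likelihood unigram model (assumption: stands in for an n-gram LM)."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def interpolate(p_common, p_adapt, lam=0.5):
    """Linear interpolation of two models (assumed combination scheme)."""
    vocab = set(p_common) | set(p_adapt)
    return {t: lam * p_adapt.get(t, 0.0) + (1 - lam) * p_common.get(t, 0.0) for t in vocab}


# Toy corpus standing in for retrievable content (hypothetical data).
corpus = [list("今天天气很好"), list("股票市场上涨")]
# First-pass result containing one misrecognized character (巿 instead of 市).
query = list("股票巿场上涨")

best = retrieve(query, corpus, top_k=1)
p_common = unigram_lm(corpus)
p_adapt = unigram_lm([corpus[i] for i in best])
p_mix = interpolate(p_common, p_adapt, lam=0.5)
print(best, p_common["市"], p_mix["市"])
```

In this toy setting the noisy first-pass result still retrieves the matching document, and the interpolated model assigns the correct character a higher probability than the common model alone, which is the effect a second recognition pass would exploit.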
Notes
Accordingly, over-segmentation is also called explicit segmentation.
Acknowledgements
This work was funded by the National Natural Science Foundation of China under no. 61876154 and no. 61876155, and by the Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under no. BE2020006-4.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Hu, S., Wang, Q., Huang, K. et al. Retrieval-based language model adaptation for handwritten Chinese text recognition. IJDAR 26, 109–119 (2023). https://doi.org/10.1007/s10032-022-00419-2
DOI: https://doi.org/10.1007/s10032-022-00419-2