Towards Accurate Handwritten Word Recognition for Hindi and Bangla
Abstract
Building accurate lexicon-free handwritten text recognizers for Indic languages is a challenging task, mostly because of the inherent complexities of Indic scripts combined with the cursive nature of handwriting. In this work, we demonstrate an end-to-end trainable CNN-RNN hybrid architecture that takes inspiration from recent advances in using residual blocks to train convolutional layers, and that includes a spatial transformer layer to learn a model invariant to the geometric distortions present in handwriting. We focus on building state-of-the-art handwritten word recognizers for two popular Indic scripts: Devanagari and Bangla. To address the need for large-scale training data for such low-resource languages, we pre-train the network on synthetically rendered data and later fine-tune it on real data. We outperform the previous lexicon-based, state-of-the-art methods on the test sets of the Devanagari and Bangla tracks of RoyDB by a significant margin.
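To make "lexicon free" concrete, the following is a minimal sketch of how the per-time-step outputs of a CNN-RNN recognizer could be turned into a word without consulting a dictionary. It assumes the network is trained with a CTC-style output layer, which the abstract does not state; the function name `greedy_ctc_decode`, the blank index, and the toy scores are all hypothetical and serve only as illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (an assumption, not from the paper)


def greedy_ctc_decode(scores: Sequence[Sequence[float]], alphabet: Sequence[str]) -> str:
    """Best-path decoding: argmax per time step, collapse repeats, drop blanks.

    scores[t][k] is the score of label k at time step t; label 0 is the blank
    and label k > 0 maps to alphabet[k - 1].
    """
    # Pick the highest-scoring label independently at every time step.
    best_path: List[int] = [
        max(range(len(step)), key=step.__getitem__) for step in scores
    ]

    chars: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when it is not blank and differs from the
        # label at the previous time step (repeats are merged).
        if label != BLANK and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Toy example with a three-character Devanagari alphabet (hypothetical scores).
alphabet = ["क", "म", "ल"]
scores = [
    [0.10, 0.80, 0.05, 0.05],  # क
    [0.20, 0.70, 0.05, 0.05],  # क again, merged with the previous step
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.70, 0.10],  # म
    [0.10, 0.10, 0.10, 0.70],  # ल
    [0.10, 0.10, 0.10, 0.70],  # ल again, merged
]
print(greedy_ctc_decode(scores, alphabet))
```

No word list appears anywhere in the decoding step: the output is built character by character from the network scores alone, which is what separates a lexicon-free recognizer from a lexicon-based one that restricts its output to a fixed vocabulary.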
Keywords
Handwriting recognition, Lexicon free, Indic scripts
Notes
Acknowledgement
This work was partly supported by the IMPRINT scheme, Govt. of India. The authors would also like to thank Oishika, Sounak and Sreya for their help in verifying the results for Bangla.