Towards Accurate Handwritten Word Recognition for Hindi and Bangla

Dutta, Kartik; Krishnan, Praveen; Mathew, Minesh; Jawahar, C. V.

doi:10.1007/978-981-13-0020-2_41

Kartik Dutta¹²,
Praveen Krishnan ORCID: orcid.org/0000-0003-4620-2292¹²,
Minesh Mathew¹² &
C. V. Jawahar¹²

Part of the book series: Communications in Computer and Information Science (CCIS, volume 841)

Included in the following conference series:

National Conference on Computer Vision, Pattern Recognition, Image Processing, and Graphics

Abstract

Building accurate lexicon-free handwritten text recognizers for Indic languages is challenging, mostly because of the inherent complexity of Indic scripts combined with the cursive nature of handwriting. In this work, we present an end-to-end trainable CNN-RNN hybrid architecture that draws on recent advances in training convolutional layers with residual blocks and includes a spatial transformer layer so that the model learns invariance to the geometric distortions present in handwriting. We focus on building state-of-the-art handwritten word recognizers for two popular Indic scripts, Devanagari and Bangla. To address the need for large-scale training data in such low-resource languages, we pre-train the network on synthetically rendered data and later fine-tune it on real data. We outperform the previous lexicon-based state-of-the-art methods on the test sets of the Devanagari and Bangla tracks of RoyDB by a significant margin.
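The resampling step at the heart of a spatial transformer layer can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say which transformation family the layer uses, so an affine transform is assumed here, and the names `affine_sample`, `bilinear`, and `pixel` are hypothetical. In a real network the 2x3 matrix `theta` would be predicted by a small learned localization network rather than supplied by hand.

```python
import math


def pixel(image, y, x):
    """Return image[y][x], or 0.0 when (y, x) falls outside the image."""
    if 0 <= y < len(image) and 0 <= x < len(image[0]):
        return image[y][x]
    return 0.0


def bilinear(image, px, py):
    """Bilinearly interpolate `image` at the real-valued pixel position (px, py)."""
    x0, y0 = math.floor(px), math.floor(py)
    dx, dy = px - x0, py - y0
    top = (1 - dx) * pixel(image, y0, x0) + dx * pixel(image, y0, x0 + 1)
    bottom = (1 - dx) * pixel(image, y0 + 1, x0) + dx * pixel(image, y0 + 1, x0 + 1)
    return (1 - dy) * top + dy * bottom


def affine_sample(image, theta, out_h, out_w):
    """Resample `image` (a list of rows of floats) through a 2x3 affine `theta`.

    Assumption: both the output grid and the input image use coordinates
    normalized to [-1, 1], and `theta` maps output coordinates to input ones.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for i in range(out_h):
        yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Where in the input (normalized coordinates) this output pixel reads from.
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            # Convert normalized coordinates back to pixel indices.
            px = (xs + 1.0) * (in_w - 1) / 2.0
            py = (ys + 1.0) * (in_h - 1) / 2.0
            row.append(bilinear(image, px, py))
        out.append(row)
    return out


# Usage: the identity transform reproduces the image; negating the x scale mirrors it.
word_image = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
unchanged = affine_sample(word_image, [[1, 0, 0], [0, 1, 0]], 3, 3)
mirrored = affine_sample(word_image, [[-1, 0, 0], [0, 1, 0]], 3, 3)
```

Because every output pixel is a weighted sum of input pixels, the operation is differentiable with respect to both the image and `theta`, which is what would let such a layer be trained end to end with the rest of a recognizer.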

Acknowledgement

This work was partly supported by the IMPRINT scheme, Govt. of India. The authors would also like to thank Oishika, Sounak and Sreya for their help in verifying the results for Bangla.

Author information

Authors and Affiliations

CVIT, IIIT Hyderabad, Hyderabad, India
Kartik Dutta, Praveen Krishnan, Minesh Mathew & C. V. Jawahar

Corresponding author

Correspondence to Kartik Dutta.

Editor information

Editors and Affiliations

Indian Institute of Technology Mandi, Mandi, Himachal Pradesh, India
Renu Rameshan
Indraprastha Institute of Information Technology, New Delhi, India
Chetan Arora
Indian Institute of Technology, New Delhi, India
Sumantra Dutta Roy

About this paper

Cite this paper

Dutta, K., Krishnan, P., Mathew, M., Jawahar, C.V. (2018). Towards Accurate Handwritten Word Recognition for Hindi and Bangla. In: Rameshan, R., Arora, C., Dutta Roy, S. (eds) Computer Vision, Pattern Recognition, Image Processing, and Graphics. NCVPRIPG 2017. Communications in Computer and Information Science, vol 841. Springer, Singapore. https://doi.org/10.1007/978-981-13-0020-2_41

DOI: https://doi.org/10.1007/978-981-13-0020-2_41
Published: 26 April 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0019-6
Online ISBN: 978-981-13-0020-2
eBook Packages: Computer Science, Computer Science (R0)
