Abstract
Decomposing a word into an appropriate set of pseudo-characters is a challenging task for a cursive script such as Bangla. The segmentation-free approach bypasses the decomposition problem entirely and treats the handwritten word as a single entity. The literature shows that the accuracy of handwritten Bangla cursive word recognition using the segmentation-free approach is relatively low, hovering between 80% and 90%. In the current work, we provide a threefold study of this domain. First, we extract different statistical feature sets from word images and use five off-the-shelf classifiers to characterize their performance. Then, we employ five CNN-based transfer learning (CNN-TL) architectures, namely AlexNet, VGG-16, VGG-19, ResNet50, and GoogleNet, to understand how they perform on holistic Bangla words. Finally, we use a seven-layer fully convolutional network (FCN) architecture and compare its results with those of all the aforementioned experiments. We achieved an accuracy of 98.86% with ResNet50, an improvement of nearly 19% over other recent state-of-the-art methodologies.
Ethics declarations
Conflicts of interest
The authors declare that there is no potential conflict of interest to report regarding this work.
Cite this article
Pramanik, R., Bag, S. Handwritten Bangla city name word recognition using CNN-based transfer learning and FCN. Neural Comput & Applic 33, 9329–9341 (2021). https://doi.org/10.1007/s00521-021-05693-5
DOI: https://doi.org/10.1007/s00521-021-05693-5