Abstract
Handwritten Urdu text recognition is a challenging area of pattern recognition that has gained importance with the rapid emergence of camera-based applications on portable devices, which process large numbers of images every day. The main challenges in handwritten Urdu recognition are writer-dependent variation, irregular positioning of the diacritics associated with a character, the context sensitivity of characters, and the cursive nature of the Urdu script, which also makes it harder to build a large handwritten Urdu dataset. Several researchers have proposed handwritten Urdu databases, but none of them individually covers the entire Urdu ligature corpus. Moreover, the recognition rates obtained by these researchers have been found to be poor when their approaches are applied to different Urdu datasets. This research therefore proposes a novel Transformer-based methodology using the BERT architecture. The proposed approach uses convolutional feature maps as the word embeddings of the transformer, which lets it make full use of the transformer's attention mechanism to focus on handwritten data. To cover the entire Urdu corpus, we evaluated the approach on all available handwritten Urdu datasets and determined the various performance parameters. The methodology is computationally efficient, with a small constant time complexity, and achieves a better recognition rate of 98% on similar datasets and 95% on dissimilar datasets.
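As an illustration of the idea of using convolutional feature maps in place of word embeddings, the following minimal sketch treats each column of a small feature map as one token and applies single-head scaled dot-product self-attention to the resulting sequence. It is not the chapter's implementation: the function names, the column-wise tokenisation, the toy values, and the use of identity projections (queries, keys and values all equal to the tokens, with no learned weights and no positional encoding) are assumptions made only to keep the example short and dependency-free.

```python
import math


def feature_map_to_tokens(fmap):
    """Turn a C x H x W feature map into W tokens of length C*H.

    Assumption: one token per image column, so the token order follows
    the horizontal writing direction of the text line.
    """
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels) for y in range(height)]
        for x in range(width)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplification (assumption): Q = K = V = tokens, i.e. no learned
    projection matrices, so the sketch shows only the attention step.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of all tokens.
        outputs.append(
            [sum(row[j] * tokens[j][i] for j in range(len(tokens))) for i in range(dim)]
        )
    return outputs, weights


# Hypothetical toy feature map: 2 channels, 2 rows, 3 columns.
fmap = [
    [[0.1, 0.5, 0.9], [0.2, 0.4, 0.8]],
    [[0.0, 0.3, 0.7], [0.1, 0.6, 1.0]],
]
tokens = feature_map_to_tokens(fmap)
outputs, weights = self_attention(tokens)
```

Here `tokens` holds three 4-dimensional vectors, one per column of the feature map, and each row of `weights` is a probability distribution describing how strongly one column attends to every other column. A full model would add learned projections, positional information and several stacked encoder layers.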
Acknowledgement
We thank Dr. Saad Bin Ahmed for providing the UNHD database. We also thank Dr. Zia for providing the NUST-UHWR database.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ganai, A.F., Khurshid, F. (2023). Handwritten Urdu Recognition Using BERT with Vision Transformers. In: Parah, S.A., Hurrah, N.N., Khan, E. (eds) Intelligent Multimedia Signal Processing for Smart Ecosystems. Springer, Cham. https://doi.org/10.1007/978-3-031-34873-0_15
DOI: https://doi.org/10.1007/978-3-031-34873-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34872-3
Online ISBN: 978-3-031-34873-0
eBook Packages: Computer Science, Computer Science (R0)