Abstract
Offline handwriting recognition for Urdu, a language with about 200 million users that is written in the Nastaleeq script, remains a challenge for the research community. The key problems are the recognition of complex ligature shapes and the lack of publicly available datasets. This paper addresses both problems by (i) proposing an end-to-end handwriting recognition system based on a new CNN-RNN architecture with n-gram language modeling, and (ii) presenting a new unconstrained dataset called NUST-UHWR. We compiled the first unconstrained Urdu handwritten dataset from around 1000 writers of diverse backgrounds, ages and genders. The text in this dataset was carefully selected from seven different fields to ensure that commonly used words from different domains are present. The model architecture can incorporate the fine-grained features necessary for handwritten text recognition in languages with complex ligatures. Our method addresses the limitations of existing architectures and achieves state-of-the-art performance on Urdu handwriting recognition (UHWR), with a minimum character error rate of 5.28%. The paper further demonstrates the generalization ability of the proposed model by training it on English and on bilingual (Urdu and English) handwritten data.
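The character error rate quoted above is commonly defined as the total edit distance between recognized and reference text divided by the total number of reference characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function names `edit_distance` and `character_error_rate` are hypothetical, and treating each Unicode code point as one character is an assumption that the abstract does not confirm.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance with unit cost for insert, delete and substitute."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: summed edit distance over summed reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Assumption: characters are Unicode code points, not grapheme clusters.
    refs = ["اردو", "abcd"]
    hyps = ["ارد", "abxd"]
    print(f"CER = {character_error_rate(refs, hyps):.2%}")
```

Under this definition, a CER of 5.28% corresponds to roughly 5 character edits per 100 reference characters. Whether diacritics and spaces count as characters depends on the evaluation protocol, which is not described here.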
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Additional information
This work was partially funded by HEC Pakistan’s grant for National Center of Artificial Intelligence (NCAI).
About this article
Cite this article
ul Sehr Zia, N., Naeem, M.F., Raza, S.M.K. et al. A convolutional recursive deep architecture for unconstrained Urdu handwriting recognition. Neural Comput & Applic 34, 1635–1648 (2022). https://doi.org/10.1007/s00521-021-06498-2
DOI: https://doi.org/10.1007/s00521-021-06498-2