Handwritten Urdu Recognition Using BERT with Vision Transformers

Intelligent Multimedia Signal Processing for Smart Ecosystems

Abstract

Handwritten Urdu text recognition is a challenging area of pattern recognition. It has gained importance with the rapid emergence of camera-based applications on portable devices, which process large numbers of images every day. The main challenges are writer-dependent variation among Urdu writers, irregular positioning of the diacritics associated with a character, the context sensitivity of characters, and the cursive nature of Urdu script, which also makes it harder to build a large handwritten Urdu dataset. Several researchers have proposed handwritten Urdu databases, but none of them individually covers the entire Urdu ligature corpus. Moreover, the recognition rates obtained by these researchers have been found to be poor when their approaches are applied to different Urdu datasets. This research therefore proposes a novel Transformer-based methodology using the BERT architecture. The proposed approach uses convolutional feature maps as the word embeddings of the transformer, so that the transformer's attention mechanism can focus on the handwritten data. To cover the entire Urdu corpus, we evaluate the approach on all available handwritten Urdu datasets and determine the various performance parameters. The methodology is computationally efficient, with a small constant time complexity, and achieves a recognition rate of 98% on similar datasets and 95% on dissimilar datasets.
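The idea of feeding convolutional feature maps to a transformer as token embeddings can be illustrated with a minimal sketch. The abstract does not say how the feature maps are turned into a sequence, so the column-wise slicing below is an assumption, as are the function names `feature_map_to_tokens` and `self_attention`. The attention step uses identity projections (queries, keys, and values are the tokens themselves), whereas a trained model would use learned projection matrices and several heads.

```python
import math


def feature_map_to_tokens(feature_map):
    """Turn a C x H x W feature map (nested lists) into W tokens of length C*H.

    Assumption: each image column becomes one token, so the sequence
    follows the horizontal writing direction of the text line.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [feature_map[c][y][x] for c in range(channels) for y in range(height)]
        for x in range(width)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Illustrative only: no learned weights, a single head, no masking.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return attended


if __name__ == "__main__":
    # Toy feature map: 2 channels, height 2, width 3 (hypothetical values).
    toy_map = [
        [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
        [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    ]
    tokens = feature_map_to_tokens(toy_map)
    print(len(tokens), "tokens of dimension", len(tokens[0]))
    print(self_attention(tokens))
```

Each output vector is a weighted average of all column tokens, which is how attention lets every position in the text line draw on context from every other position.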

Acknowledgement

We thank Dr. Saad Bin Ahmed for providing the UNHD database. We also thank Dr. Zia for providing the NUST-UHWR database.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ganai, A.F., Khurshid, F. (2023). Handwritten Urdu Recognition Using BERT with Vision Transformers. In: Parah, S.A., Hurrah, N.N., Khan, E. (eds) Intelligent Multimedia Signal Processing for Smart Ecosystems. Springer, Cham. https://doi.org/10.1007/978-3-031-34873-0_15

  • DOI: https://doi.org/10.1007/978-3-031-34873-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34872-3

  • Online ISBN: 978-3-031-34873-0

  • eBook Packages: Computer Science, Computer Science (R0)
