Cluster Computing, Volume 21, Issue 1, pp 703–714

Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory

  • Ibrar Ahmad
  • Xiaojie Wang
  • Yuzhao Mao
  • Guang Liu
  • Haseeb Ahmad
  • Rahat Ullah
Abstract

Bidirectional long short term memory (BLSTM) architecture, a special case of the recurrent neural network, has been successfully applied to the recognition of Urdu Nastaleeq sentence images based on character information. In such cases, manually labeling the characters in sentences for a large dataset is labor-intensive, because the same character takes different shapes at different positions inside ligatures and words. Labeling a dataset with ligatures, on the other hand, is relatively easier and more accurate. In this paper, we propose a novel gated BLSTM (GBLSTM) model for recognition of printed Urdu Nastaleeq text based on ligature information. The proposed model uses raw pixel values as features instead of human-crafted features, because the latter are more error prone. The model is trained on un-degraded versions of an Urdu printed text image dataset and tested on unseen, artificially degraded versions. The recognition accuracy of the proposed GBLSTM model is 96.71%, which is higher than that of prevalent Urdu optical character recognition systems.
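
The abstract says the model takes raw pixel values as features. One common way to feed a text-line image to a recurrent network is to treat each pixel column as one time step; whether the paper frames its input this way is an assumption, and the function name `image_to_frames` and the `max_value` parameter are hypothetical. A minimal sketch:

```python
from typing import List


def image_to_frames(image: List[List[int]], max_value: int = 255) -> List[List[float]]:
    """Convert a grayscale text-line image (rows x columns) into column frames.

    Each frame holds the raw pixel values of one column, scaled to [0, 1].
    Column-wise framing is an assumption, not the paper's stated method.
    """
    if not image or not image[0]:
        return []
    height, width = len(image), len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    # Frames are emitted left to right; a bidirectional network would
    # read the resulting sequence in both directions.
    return [
        [image[r][c] / max_value for r in range(height)]
        for c in range(width)
    ]


# A 2 x 3 toy image yields 3 frames of 2 values each.
toy_image = [[0, 255, 0], [255, 0, 255]]
frames = image_to_frames(toy_image)
```

No hand-crafted descriptors are computed here: the sequence passed to the recognizer is just the scaled pixel intensities.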

Keywords

Optical character recognition; Urdu Nastaleeq sentence recognition; BLSTM; GBLSTM; Urdu OCR

Notes

Acknowledgements

The authors would like to thank Dr. Ruifan Li, Center for Intelligence of Science and Technology (CIST), Beijing University of Posts and Telecommunications, China. We would also like to thank Mr. Mohammad Saad Khan, Beijing University of Posts and Telecommunications, China for his guidance and unprecedented help.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Center for Intelligence of Science and Technology (CIST), Beijing University of Posts and Telecommunications, Beijing, China
  2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
  3. Nanjing University of Information Sciences and Technology, Nanjing, China
