A fast and accurate contour-based method for writer-dependent offline handwritten Farsi/Arabic subwords recognition

  • Kazim Fouladi
  • Babak N. Araabi
  • Ehsanollah Kabir
Original Paper

Abstract

This paper is concerned with the recognition of offline Farsi/Arabic handwriting. The overall appearance of each subword in Farsi/Arabic script is described by its shape contour, which provides a rich set of discriminative characteristics. Our approach is writer-dependent; that is, the system is trained to recognize the subwords written by a particular writer. A fast contour alignment, performed on the basis of a handful of feature points, is the central part of the proposed algorithm. An efficient lexicon reduction algorithm based on the characteristic loci feature, which works directly on the subwords' binary images, is proposed as well. Fast and precise alignment, together with efficient lexicon reduction and appropriate similarity matching, yields a high recognition rate while keeping the speed high. Our experiment on the IBN SINA database shows that the correct classification rate can be as high as 91.08%. This figure is achieved merely by subword shape matching, without dots and diacritics, and without any statistical language model.
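To make the lexicon reduction idea concrete, the following is a minimal sketch of a characteristic loci descriptor used to shortlist lexicon entries. It is not the authors' implementation: the abstract does not give the details, so the four ray directions, the crossing cap of 2, the L1 distance, and the function names `count_crossings`, `characteristic_loci`, and `reduce_lexicon` are all assumptions made for illustration. For every background pixel of a binary image, the code counts how many strokes a ray meets in each of four directions, packs the four counts into one code, and builds a normalized histogram of those codes.

```python
import numpy as np


def count_crossings(ray, cap=2):
    """Count the strokes met along a 1-D ray of 0/1 pixels, capped at `cap` (assumed cap)."""
    ray = np.asarray(ray, dtype=np.int8)
    if ray.size == 0:
        return 0
    # A stroke begins where an ink pixel follows a background pixel,
    # or where the ray itself starts on ink.
    starts = int(ray[0] == 1) + int(np.sum((ray[1:] == 1) & (ray[:-1] == 0)))
    return min(starts, cap)


def characteristic_loci(img, cap=2):
    """Normalized histogram of four-direction crossing codes over background pixels."""
    img = (np.asarray(img) > 0).astype(np.int8)
    base = cap + 1
    hist = np.zeros(base ** 4)
    for r, c in zip(*np.nonzero(img == 0)):
        # Rays leave the pixel to the right, left, up, and down (assumed order).
        rays = (img[r, c + 1:], img[r, :c][::-1], img[:r, c][::-1], img[r + 1:, c])
        code = 0
        for ray in rays:
            code = code * base + count_crossings(ray, cap)
        hist[code] += 1
    total = hist.sum()
    return hist / total if total else hist


def reduce_lexicon(query_img, lexicon, keep=2):
    """Return the `keep` labels whose descriptors are nearest to the query (L1 distance, assumed)."""
    q = characteristic_loci(query_img)
    dists = {label: float(np.abs(q - characteristic_loci(img)).sum())
             for label, img in lexicon.items()}
    return sorted(dists, key=dists.get)[:keep]


# Toy binary images standing in for subword images (hypothetical data).
ring = np.array([[0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 0],
                 [0, 1, 0, 1, 0],
                 [0, 1, 1, 1, 0],
                 [0, 0, 0, 0, 0]])
bar = np.zeros((5, 5), dtype=int)
bar[:, 2] = 1

shortlist = reduce_lexicon(ring, {"ring": ring, "bar": bar}, keep=1)
```

In a full recognizer, a shortlist like this would only narrow the candidates that are then passed to the more expensive contour alignment and similarity matching stages described above.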

Keywords

Farsi, Persian, Arabic subword, Contour alignment, Handwriting recognition, Writer-dependent, Lexicon reduction, Characteristic loci, IBN SINA database

Notes

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and constructive suggestions, which helped to improve the content and presentation of the paper.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Kazim Fouladi (1, 2)
  • Babak N. Araabi (1, 2)
  • Ehsanollah Kabir (3)

  1. Learning Intelligent Systems Lab, School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
  2. School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
  3. Department of Electrical and Computer Engineering, Tarbiat Modarres University, Tehran, Iran
