Offline Handwritten Arabic Character Segmentation with Probabilistic Model

  • Pingping Xiu
  • Liangrui Peng
  • Xiaoqing Ding
  • Hua Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)

Abstract

Research on offline handwritten Arabic character recognition has received increasing attention in recent years because of the growing need for Arabic document digitization. Variation in Arabic handwriting makes character segmentation and recognition difficult; e.g., the sub-parts (diacritics) of an Arabic character may shift away from its main part. In this paper, a new probabilistic segmentation model is proposed. First, a contour-based over-segmentation method cuts the word image into graphemes. The graphemes are sorted into three queues: character main parts, sub-parts (diacritics) above the main parts, and sub-parts (diacritics) below the main parts. The confidence of each character is calculated by the probabilistic model, which takes into account both the recognizer output and the geometric confidence, together with logical constraints. Then, a global optimization is conducted to find the optimal cutting path, taking the weighted average of the character confidences as the objective function. Experiments on handwritten Arabic documents with various writing styles show that the proposed method is effective.
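The global optimization step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the weights nor the search procedure, so the sketch assumes that each candidate character is a run of consecutive main-part graphemes, that its weight is the number of graphemes it spans, and that a hypothetical `confidence(i, j)` callable already combines the recognizer output, the geometric confidence, and the logical constraints. The diacritic queues are left out. Under these assumptions the weights of any complete path sum to the grapheme count, so maximizing the weighted average is the same as maximizing the weighted sum, which dynamic programming solves exactly.

```python
from typing import Callable, List, Tuple


def best_cutting_path(
    n: int,
    confidence: Callable[[int, int], float],
    max_span: int = 3,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Group graphemes 0..n-1 into consecutive characters [i, j).

    Assumption (not from the paper): a character's weight is its span
    length j - i, and confidence(i, j) is a hypothetical combined score.
    Returns the best weighted-average confidence and the chosen spans.
    """
    if n <= 0:
        return 0.0, []
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[j]: best weighted sum covering [0, j)
    back = [-1] * (n + 1)       # back[j]: start of the last character ending at j
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_span), j):
            if best[i] == neg_inf:
                continue  # no valid segmentation reaches grapheme i
            score = best[i] + (j - i) * confidence(i, j)
            if score > best[j]:
                best[j] = score
                back[j] = i
    if best[n] == neg_inf:
        raise ValueError("no valid cutting path under the given confidences")
    # Walk the back-pointers from the end of the word to recover the path.
    cuts: List[Tuple[int, int]] = []
    j = n
    while j > 0:
        i = back[j]
        cuts.append((i, j))
        j = i
    cuts.reverse()
    return best[n] / n, cuts


# Toy confidences for a word of three graphemes (illustrative values only).
toy = {
    (0, 1): 0.9, (1, 2): 0.4, (2, 3): 0.5,
    (0, 2): 0.3, (1, 3): 0.8, (0, 3): 0.2,
}
score, cuts = best_cutting_path(3, lambda i, j: toy[(i, j)])
print(score, cuts)
```

With the toy values, the selected path keeps grapheme 0 as one character and merges graphemes 1 and 2 into another, because the merged span scores higher than the two graphemes taken separately. A different weighting scheme (for example, pixel width) would fit the same recurrence as long as the weights of every complete path sum to the same total.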

Keywords

Optical Character Recognition; Logical Rule; Integral Character; Arabic Text; Stroke Width

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Pingping Xiu (1)
  • Liangrui Peng (1)
  • Xiaoqing Ding (1)
  • Hua Wang (1)
  1. Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, Beijing, China
