Text Line and Word Extraction of Arabic Handwritten Documents

  • Asmae Lamsaf (email author)
  • Mounir Aitkerroum
  • Siham Boulaknadel
  • Youssef Fakhri
Conference paper
Part of the Lecture Notes in Intelligent Transportation and Infrastructure book series (LNITI)

Abstract

Arabic handwritten documents consist of text lines and words. A word is often a succession of sub-words (characters, connected components) separated by spaces. In Arabic handwriting these spaces fall into two types: within-word spaces, which separate two connected components of the same word, and between-word spaces, which separate two connected components belonging to two consecutive words. Word extraction relies on classifying the detected spaces and using the between-word spaces to segment the text into words. In this paper, we present a method for segmenting Arabic handwritten text into lines and words. To improve word extraction, the space-classification threshold is not fixed for the whole document; instead, we compute a separate threshold for each line. The text must therefore be segmented into lines before it is segmented into words, so that the method can be applied line by line. To extract the lines, the text images are first preprocessed and the proposed line segmentation method is then applied. We evaluated the system on the benchmark Arabic handwriting database for text recognition (AHDB); the experimental results are very promising, with a word extraction success rate of 87.9%.
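
The per-line gap classification described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the threshold is computed, so the mean gap width used here is only an assumption, and the names `merge_overlaps`, `line_threshold` and `split_line_into_words` are hypothetical. Each connected component is reduced to its horizontal extent on an already extracted text line.

```python
from typing import List, Tuple

# (x_start, x_end) of one connected component on a text line, in pixels
Box = Tuple[int, int]


def merge_overlaps(boxes: List[Box]) -> List[Box]:
    """Sort components by x and merge those that overlap horizontally."""
    merged: List[Box] = []
    for start, end in sorted(boxes):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def line_threshold(gaps: List[int]) -> float:
    """Assumed rule (not taken from the paper): the mean gap width of this line."""
    return sum(gaps) / len(gaps) if gaps else 0.0


def split_line_into_words(boxes: List[Box]) -> List[List[Box]]:
    """Group the components of ONE line into words using that line's own threshold."""
    comps = merge_overlaps(boxes)
    if not comps:
        return []
    # Width of the white space between each pair of neighbouring components.
    gaps = [comps[i + 1][0] - comps[i][1] for i in range(len(comps) - 1)]
    threshold = line_threshold(gaps)
    words: List[List[Box]] = [[comps[0]]]
    for gap, comp in zip(gaps, comps[1:]):
        if gap > threshold:
            words.append([comp])    # between-word space: start a new word
        else:
            words[-1].append(comp)  # within-word space: same word continues
    return words


# Two lines written at different scales each get their own threshold.
tight_line = [(0, 10), (12, 20), (40, 55), (57, 60), (90, 100)]
wide_line = [(0, 30), (36, 70), (130, 160)]
print(len(split_line_into_words(tight_line)))  # 3
print(len(split_line_into_words(wide_line)))   # 2
```

The words come out ordered by increasing x; since Arabic is written right to left, reversing the returned list gives reading order.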

Keywords

Word extraction, Arabic handwriting, Recognition of Arabic handwriting, Line segmentation, Handwriting analysis

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Asmae Lamsaf (1), email author
  • Mounir Aitkerroum (1)
  • Siham Boulaknadel (2)
  • Youssef Fakhri (1)

  1. LaRIT, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco
  2. IRCAM, Madinat al Irfane, Rabat-Instituts, Rabat, Morocco
