Abstract
Handwriting recognition for Arabic script has attracted many researchers from both academia and industry, but their efforts have not yet produced satisfactory results. This paper surveys the segmentation and recognition of handwritten documents in Arabic script. Most previously published work is analyzed, and some remedies are suggested. The strategies used to build a powerful recognition system are summarized. The paper presents algorithms for text, word, and character segmentation and for the recognition of Arabic documents, and it analyzes the recognition stage of Arabic script according to the segmentation strategy used.
Change history
28 September 2023
A Correction to this paper has been published: https://doi.org/10.1007/s42979-023-02168-3
Cite this article
Ali, A.A.A., Suresha, M. Survey on Segmentation and Recognition of Handwritten Arabic Script. SN COMPUT. SCI. 1, 192 (2020). https://doi.org/10.1007/s42979-020-00187-y
DOI: https://doi.org/10.1007/s42979-020-00187-y