Cursive Arabic handwritten word recognition system using majority voting and k-NN for feature descriptor selection

Multimedia Tools and Applications

Abstract

Handwritten text recognition has attracted growing research attention. Recognizing Arabic handwritten text remains an ongoing challenge, mainly because of the similarity between letters and the variety of writing styles. Cursive handwriting is particularly laborious to recognize because of the complexity of Arabic handwriting morphology. This paper proposes an efficient image-processing approach that combines three image descriptors in the feature extraction phase: the Histogram of Oriented Gradients (HOG), the Gabor filter, and the Local Binary Pattern (LBP). To prepare the training and testing datasets, we applied a series of preprocessing techniques to 100 classes selected from the IFN/ENIT database of handwritten Arabic words. We then trained the k-nearest neighbor (k-NN) algorithm to generate the best model for each feature descriptor. The best k-NN model for each descriptor is used to classify Arabic handwritten word images into their classes. Model performance is evaluated with common metrics, namely accuracy, sensitivity, specificity, and precision. Based on the performance of the three k-NN models, a majority-voting algorithm combines their predictions. The approach achieves a recognition rate of up to 99.88%, exceeding previously reported state-of-the-art results on the IFN/ENIT dataset. These results highlight the reliability of the final model for recognizing handwritten Arabic words.
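The combination and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and in particular the tie-breaking rule (fall back to the most trusted model when all three models disagree) are assumptions made for illustration, since the abstract does not say how ties are resolved. The metrics are computed one-vs-rest for a single class.

```python
from collections import Counter


def majority_vote(votes, priority):
    """Combine the labels predicted by several models for one sample.

    votes: dict mapping model name -> predicted label.
    priority: model names ordered from most to least trusted; used only
    to break ties (an assumption, not taken from the paper).
    """
    counts = Counter(votes.values())
    top = max(counts.values())
    winners = {label for label, count in counts.items() if count == top}
    if len(winners) == 1:
        return next(iter(winners))
    # No strict majority: use the most trusted model among the tied labels.
    for name in priority:
        if votes[name] in winners:
            return votes[name]


def combine_predictions(per_model, priority):
    """Apply majority_vote sample by sample over equal-length prediction lists."""
    n_samples = len(next(iter(per_model.values())))
    return [
        majority_vote({name: preds[i] for name, preds in per_model.items()}, priority)
        for i in range(n_samples)
    ]


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, specificity and precision for one class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Avoid division by zero when a class never occurs.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Toy predictions from three hypothetical k-NN models (one per descriptor).
per_model = {
    "hog": ["a", "b", "c", "a"],
    "gabor": ["a", "c", "c", "b"],
    "lbp": ["b", "b", "c", "c"],
}
y_true = ["a", "b", "c", "a"]
combined = combine_predictions(per_model, priority=["hog", "gabor", "lbp"])
print(combined)
print(one_vs_rest_metrics(y_true, per_model["lbp"], positive="b"))
```

In this toy data the last sample receives three different votes, so the result depends entirely on the assumed priority order; with real models the order would presumably follow their measured performance.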

Notes

  1. http://www.ifnenit.com/

  2. https://www.kaggle.com/mloey1/ahcd1

  3. https://fki.tic.heia-fr.ch/databases/iam-handwriting-database

Author information

Corresponding author

Correspondence to Bouchaib Cherradi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest regarding the publication of this work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hamida, S., Cherradi, B., El Gannour, O. et al. Cursive Arabic handwritten word recognition system using majority voting and k-NN for feature descriptor selection. Multimed Tools Appl 82, 40657–40681 (2023). https://doi.org/10.1007/s11042-023-15167-6

  • DOI: https://doi.org/10.1007/s11042-023-15167-6
