Recognition of Handwritten Arabic Characters using Histograms of Oriented Gradient (HOG)

  • Applied Problems
Pattern Recognition and Image Analysis

Abstract

Optical Character Recognition (OCR) is the process of recognizing printed or handwritten text on paper documents. This paper proposes an OCR system for Arabic characters. In addition to a preprocessing phase, the proposed recognition system consists of three main phases. In the first phase, we employ word segmentation to extract characters. In the second phase, Histograms of Oriented Gradient (HOG) are used for feature extraction. The final phase employs a Support Vector Machine (SVM) to classify the characters. As a case study, we applied the proposed method to the recognition of Jordanian city, town, and village names, together with many other words that supply the character shapes not covered by the Jordanian place names. The word set was carefully selected to include every Arabic character in all four of its forms. To this end, we built our own dataset of more than 43,000 handwritten Arabic words (30,000 used in the training stage and 13,000 used in the testing stage). Experimental results show that our recognition method compares favorably with state-of-the-art techniques, achieving recognition rates exceeding 99%.
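
To make the feature-extraction phase concrete, the following is a minimal sketch of a simplified HOG descriptor in plain Python. It is not the authors' implementation: the function name `hog_descriptor`, the cell size, the number of bins, and the per-cell L2 normalization (instead of overlapping block normalization) are all assumptions chosen for illustration. The input is assumed to be a segmented, grayscale character image given as a list of rows.

```python
import math

def hog_descriptor(image, cell_size=4, n_bins=9):
    """Simplified HOG sketch (hypothetical helper, not the paper's code).

    Builds one orientation histogram per non-overlapping cell and
    L2-normalizes each histogram on its own.
    """
    height, width = len(image), len(image[0])
    bin_width = 180.0 / n_bins
    descriptor = []
    for cell_top in range(0, height - cell_size + 1, cell_size):
        for cell_left in range(0, width - cell_size + 1, cell_size):
            hist = [0.0] * n_bins
            for y in range(cell_top, cell_top + cell_size):
                for x in range(cell_left, cell_left + cell_size):
                    # Centered differences, clamped at the image border.
                    gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
                    magnitude = math.hypot(gx, gy)
                    # Unsigned gradient orientation in [0, 180) degrees.
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    # Each pixel votes for one bin, weighted by magnitude.
                    hist[int(angle // bin_width) % n_bins] += magnitude
            # Small epsilon avoids division by zero in flat cells.
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-9
            descriptor.extend(v / norm for v in hist)
    return descriptor

# Toy 8x8 "character" with a single vertical edge down the middle.
edge_image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
features = hog_descriptor(edge_image)
```

With an 8x8 image and 4x4 cells this yields four cells of nine bins each. In a pipeline like the one described above, such a feature vector would then be passed to a classifier such as an SVM; that step is omitted here.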

Author information

Corresponding author

Correspondence to Noor A. Jebril.

Additional information

The article is published in the original.

Hussein R. Al-Zoubi received his MSE and Ph.D. in Computer Engineering from the University of Alabama in Huntsville, USA in 2004 and 2007, respectively. Since 2007, he has been working with the Department of Computer Engineering, Hijjawi Faculty for Engineering Technology, Yarmouk University, Jordan. He is currently an associate professor and the chair of the Department of Computer Engineering. His research interests include machine vision, pattern recognition, image processing, computer networks and their applications: wireless and wired, security, multimedia, queuing analysis, and high-speed networks. He is a senior member of IEEE.

Qasem Abu Al-Haija is a senior lecturer of Electrical and Computer Engineering at King Faisal University. He is a Jordanian resident (married), born on July 16, 1982, and is proficient in both Arabic and English. Eng. Abu Al-Haija received his B.S. in ECE from Mu’tah University in February 2005. He then worked as a network engineer at a leading institute in KSA, and as a lecturer, before joining the graduate program at Jordan University of Science and Technology in September 2007, where he received his M.Sc. degree in computer engineering in December 2009. His research interests include information security and cryptography, high-performance coprocessor and FPGA design, computer arithmetic and algorithms, and wireless sensor networks.

Noor A. Jebril is a lecturer of Computer Sciences at King Faisal University. She is a Jordanian resident (married) and is proficient in both Arabic and English. Eng. Noor received her B.S. in software engineering from Philadelphia University in August 2009. She joined the graduate program at Yarmouk University of Science and Technology in August 2011, where she received her M.Sc. degree in computer engineering before joining King Faisal University as a lecturer in the computer science department. Her research interests include digital image processing, biomedical hardware and software, computer programming, algorithms, and security.

About this article

Cite this article

Jebril, N.A., Al-Zoubi, H.R. & Abu Al-Haija, Q. Recognition of Handwritten Arabic Characters using Histograms of Oriented Gradient (HOG). Pattern Recognit. Image Anal. 28, 321–345 (2018). https://doi.org/10.1134/S1054661818020141

  • DOI: https://doi.org/10.1134/S1054661818020141