
A unified feature descriptor for generic character recognition based on zoning and histogram of gradients

  • Original Article
  • Neural Computing and Applications

Abstract

Character recognition is one of the most interesting and challenging areas of pattern recognition, with many real-world applications. Although modern systems recognize machine-printed text easily, recognizing handwritten text and text in natural scene images remains challenging because of the large variation in writing styles and uncontrolled imaging conditions, respectively. Many modern systems use convolutional neural networks to learn features automatically rather than relying on handcrafted features. This work designs a unified handcrafted feature descriptor, based on zoning and histogram of gradients, that is common to printed, handwritten and natural image characters. The effectiveness of this unified feature is tested on the Chars74K dataset, which contains printed, handwritten and natural image characters, and on the EMNIST dataset, which contains handwritten characters. The importance of extracting the outer structure of characters during skeletonization is also studied. The results show that the recognition accuracy of the proposed feature descriptor is comparable with state-of-the-art methods and even outperforms some modern convolutional neural network models.
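To make the idea of combining zoning with gradient histograms concrete, the sketch below splits a character image into a grid of zones and builds one gradient-orientation histogram per zone. It is a generic illustration only: the abstract does not give the descriptor's details, so the function name `zoned_gradient_histogram`, the zone grid, the bin count, the central-difference gradients and the per-zone L2 normalisation are all assumptions and should not be read as the authors' method.

```python
import math


def zoned_gradient_histogram(image, zones=2, bins=8):
    """Concatenate per-zone histograms of unsigned gradient orientation.

    Hypothetical helper for illustration; not the descriptor from the paper.
    image: list of equal-length rows of numbers (grayscale or binary).
    zones: the image is split into zones x zones rectangular regions.
    bins:  number of orientation bins covering [0, 180) degrees.
    """
    height, width = len(image), len(image[0])
    features = [0.0] * (zones * zones * bins)
    for y in range(height):
        for x in range(width):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = min(int(angle * bins / 180.0), bins - 1)
            # Zoning: map the pixel to its cell in the zones x zones grid.
            zone_index = (y * zones // height) * zones + (x * zones // width)
            # Magnitude-weighted vote into that zone's histogram.
            features[zone_index * bins + bin_index] += magnitude
    # L2-normalise each zone's histogram independently.
    for z in range(zones * zones):
        block = features[z * bins:(z + 1) * bins]
        norm = math.sqrt(sum(v * v for v in block))
        if norm > 0:
            features[z * bins:(z + 1) * bins] = [v / norm for v in block]
    return features


# A 4x4 image with a single vertical edge between columns 1 and 2.
edge = [[0, 0, 1, 1] for _ in range(4)]
descriptor = zoned_gradient_histogram(edge, zones=2, bins=8)
print(len(descriptor))  # 2 * 2 zones * 8 bins = 32 values
```

The resulting fixed-length vector could then be passed to any conventional classifier. Because the zone grid is defined relative to the image size, the same routine applies to printed, handwritten or scene character crops of different resolutions, which is the kind of format independence the abstract describes.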

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to M. Arun.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Arun, M., Arivazhagan, S. A unified feature descriptor for generic character recognition based on zoning and histogram of gradients. Neural Comput & Applic 34, 12223–12234 (2022). https://doi.org/10.1007/s00521-022-07110-x

  • DOI: https://doi.org/10.1007/s00521-022-07110-x
