A low-cost hybrid handwritten Devanagari character classifier

  • S.I. : Low Resource Machine Learning Algorithms (LR-MLA)
  • Innovations in Systems and Software Engineering

Abstract

Optical character recognition (OCR) of handwritten documents is an active field of research because of its real-life applications: it is a way of digitizing documents. Digitizing handwritten documents is a complex research problem because of challenges such as skewed writing and widely varying writing styles, yet it has widespread utility in social life as well as in preserving biblical information. In banking, hospitals, education and government offices, OCR saves human time and effort. Classification is the most important and most complex step, and it contributes the most to the accuracy of an OCR system. In this article, we present a low-cost hybrid model suited to recognizing handwritten characters of low-resource scripts such as Devanagari. The hybrid classifier combines three algorithms, HOG-CNN-MLP: the histogram of oriented gradients (HOG) extracts gradient features from the character images, the convolutional neural network (CNN) layers reinforce each feature into a specified version of itself, and a multilayer perceptron (MLP) performs the final classification. The main contribution of the article lies in the minimal system requirements and strong performance. The model is tested on two public databases of handwritten Devanagari characters, the DHCD dataset and the Kaggle Devanagari offline handwritten character dataset, and achieves accuracies of 98.79% and 97.10%, respectively.
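The data flow of the HOG-CNN-MLP combination described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the 32x32 input size, the 46 output classes, the 4x4 cells with 9 orientation bins, the use of a 1-D convolution over the HOG vector, the layer sizes, and the random untrained weights. It shows only how the three stages could be chained, not the authors' architecture or results.

```python
import math
import random

def hog_features(img, cell=4, bins=9):
    """Per-cell histograms of unsigned gradient orientation (simplified HOG,
    without overlapping-block normalization)."""
    h, w = len(img), len(img[0])
    feats = []
    for cy in range(0, h - cell + 1, cell):
        for cx in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # Central differences, clamped at the image border.
                    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                    mag = math.hypot(gx, gy)
                    ang = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[min(int(ang / (180.0 / bins)), bins - 1)] += mag
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
            feats.extend(v / norm for v in hist)  # L2-normalize each cell
    return feats

def conv1d_relu(x, kernels):
    """Valid 1-D convolution plus ReLU; channels are concatenated (assumed design)."""
    out = []
    for k in kernels:
        n = len(x) - len(k) + 1
        out.extend(max(0.0, sum(x[i + j] * k[j] for j in range(len(k))))
                   for i in range(n))
    return out

def dense(x, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, biases)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def random_params(n_feats, n_classes, n_kernels=2, ksize=3, hidden=16, seed=0):
    """Hypothetical untrained weights, only to make the forward pass runnable."""
    rng = random.Random(seed)
    rnd = lambda n: [rng.uniform(-0.1, 0.1) for _ in range(n)]
    conv_len = n_kernels * (n_feats - ksize + 1)
    return {
        "kernels": [rnd(ksize) for _ in range(n_kernels)],
        "w1": [rnd(conv_len) for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rnd(hidden) for _ in range(n_classes)],
        "b2": [0.0] * n_classes,
    }

def classify(img, params):
    """HOG features -> 1-D conv layer -> MLP with one hidden layer -> class probabilities."""
    f = hog_features(img)
    c = conv1d_relu(f, params["kernels"])
    hidden = [max(0.0, v) for v in dense(c, params["w1"], params["b1"])]
    return softmax(dense(hidden, params["w2"], params["b2"]))

# Synthetic 32x32 "character": a vertical stroke (assumed input size).
img = [[1.0 if 14 <= x <= 17 else 0.0 for x in range(32)] for _ in range(32)]
feats = hog_features(img)
params = random_params(len(feats), n_classes=46)  # 46 classes is an assumption
probs = classify(img, params)
print(len(feats), max(range(len(probs)), key=probs.__getitem__))
```

Because the weights are random, the predicted class is meaningless; in practice the convolution and MLP weights would be trained on labelled character images, and the HOG step would typically come from an established library rather than a hand-written loop.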

Data Availability

The Kaggle Devanagari offline handwritten character dataset that supports the findings of this study is openly available on Kaggle at https://www.kaggle.com/datasets/ashokpant/devanagari-character-dataset (reference [18]). The DHCD dataset can be obtained from the corresponding author of reference [19].

References

  1. Chaudhuri BB, Pal U (1998) A complete printed Bangla OCR system. Pattern Recogn 31(5):531–549

  2. Mukherjee J, Parui SK, Roy U (2020) NN-based analytic approach to symbol level recognition for degraded Bengali printed documents. Sādhanā 45(1):1–22

  3. Mukherjee J, Parui SK, Roy U (2021) An unsupervised and robust line and word segmentation method for handwritten and degraded printed document. Trans Asian Low-Resour Lang Inf Process 21(2):1–31

  4. Sharma AK, Thakkar P, Adhyaru DM, Zaveri TH (2019) Handwritten Gujarati character recognition using structural decomposition technique. Pattern Recogn Image Anal 29(2):325–338

  5. Seethalakshmi R, Sreeranjani T, Balachandar T, Singh A, Singh M, Ratan R, Kumar S (2005) Optical character recognition for printed Tamil text using Unicode. J Zhejiang Univer-Sci A 6(11):1297–1305

  6. Narang SR, Jindal MK, Ahuja S, Kumar M (2020) On the recognition of Devanagari ancient handwritten characters using SIFT and Gabor features. Soft Comput 24(22):17279–17289

  7. Schantz HF (1982) The history of OCR, optical character recognition. Manchester Center, Vt., Recognition Technologies Users Association

  8. Paul (1970) The History of OCR, vol 12, number 46. Data processing magazine

  9. Fournier E (1920) The type-reading optophone, our surplus, our ships, and Europe’s need, and more, vol 123, number 19. Scientific American Publishing Co., pp 463–465

  10. d’Albe EF (1914) On a type-reading optophone. Proc R Soc Lond. Ser A, Contain Pap Math and Phys Charact 90(619):373–375

  11. Pruden JB (1912) Cut-off. Google Patents. US Patent 1,018,925

  12. Macchina per leggere pei ciechi (PDF), in La scienza per tutti, Year XXVIII, n° 2, Milano, Casa Editrice Sozogno, 15 January 1921, p. 20 (in Italian)

  13. Gustav T (1935) Card controlled machine. Google Patents. US Patent 1,997,157

  14. Guha R, Das N, Kundu M, Nasipuri M, Santosh KC (2020) Devnet: an efficient CNN architecture for handwritten Devanagari character recognition. Int J Pattern Recogn Artif Intell 34(12):2052009

  15. Bisht M, Gupta R (2021) Offline handwritten Devanagari modified character recognition using convolutional neural network. Sādhanā 46(1):1–4

  16. Pande SM, Jha BK (2021) Character recognition system for Devanagari script using machine learning approach. In: 2021 5th international conference on computing methodologies and communication (ICCMC). IEEE, pp 899–903

  17. Kaaniche MB, Brémond F (2009) Tracking HOG descriptors for gesture recognition. In: 2009 6th IEEE international conference on advanced video and signal based surveillance. IEEE, pp 140–145

  18. Pant AK, Panday SP, Joshi SR (2012) Off-line Nepali handwritten character recognition using Multilayer Perceptron and Radial basis function neural networks. In: 2012 3rd Asian himalayas international conference on internet. IEEE, pp 1–5

  19. Acharya S, Pant AK, Gyawali PK (2015) Deep learning based large scale handwritten Devanagari character recognition. In: 2015 9th international conference on software, knowledge, information management and applications (SKIMA). IEEE, pp 1–6

  20. Aktaruzzaman M, Dagnew TM, Rivolta MW, Sassi R (2020) Improved low-cost recognition system for handwritten Bengali numerals. Int J Comput Appl Technol 62(4):375–383

  21. Basu S, Das N, Sarkar R, Kundu M, Nasipuri M, Basu DK (2009) A hierarchical approach to recognition of handwritten Bangla characters. Pattern Recogn 42(7):1467–1484

  22. Dongre VJ, Mankar VH (2015) Devanagari offline handwritten numeral and character recognition using multiple features and neural network classifier. In: 2015 2nd international conference on computing for sustainable global development (INDIACom). IEEE, pp 425–431

Author information

Corresponding author

Correspondence to Jayati Mukherjee.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mukherjee, J., Mishra, S., Tomar, A. et al. A low-cost hybrid handwritten Devanagari character classifier. Innovations Syst Softw Eng (2022). https://doi.org/10.1007/s11334-022-00518-7

  • DOI: https://doi.org/10.1007/s11334-022-00518-7
