Abstract
Optical character recognition (OCR) of handwritten documents is an active field of research because of its real-world use in digitizing documents. Digitizing handwritten documents is a difficult research problem because of challenges such as skewed writing and the wide variety of writing styles, yet it has widespread utility in everyday life as well as in preserving biblical information. In banking, hospitals, education, and government offices, OCR saves human time and effort. Classification is the most important and most complex step, and it contributes most to the accuracy of an OCR system. In this article, we present a low-cost hybrid model suitable for recognizing characters of a low-resource script such as handwritten Devanagari. The hybrid classifier combines HOG, CNN, and MLP stages: HOG extracts gradient features from the character images, CNN layers reinforce each extracted feature into a more specialized version of itself, and a multilayer perceptron (MLP) performs the final classification. The main contribution of the article lies in the model's minimal system requirements and strong performance. The model is tested on two public databases of handwritten Devanagari characters, the DHCD dataset and the Kaggle Devanagari offline handwritten character dataset, and achieves accuracies of 98.79% and 97.10%, respectively.
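As a rough illustration of the HOG-CNN-MLP pipeline described in the abstract, the following NumPy sketch chains a simplified HOG extractor, one convolution stage, and a small MLP. It is not the authors' implementation: the function names, the 32×32 input size, the 4-pixel cells, the 16 filters, the 64 hidden units, and the 46 output classes are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def hog_features(img, cell=4, bins=9):
    """Simplified HOG: one unsigned-orientation histogram per cell.

    Block normalisation from the full HOG descriptor is omitted; each
    cell histogram is L2-normalised on its own instead (an assumption).
    """
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)                      # derivatives along rows, columns
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    # Clamp guards against rounding pushing an angle onto the 180 boundary.
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = img.shape[0] // cell, img.shape[1] // cell
    feats = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Magnitude-weighted vote of every pixel in the cell.
            feats[r, c] = np.bincount(
                bin_idx[sl].ravel(), weights=mag[sl].ravel(), minlength=bins
            )
    norms = np.linalg.norm(feats, axis=2, keepdims=True)
    return feats / (norms + 1e-6)                  # shape (rows, cols, bins)


def conv2d_relu(x, kernels):
    """Valid 2-D cross-correlation over an (H, W, C) map, followed by ReLU."""
    k = kernels.shape[0]                           # kernels: (k, k, C, F)
    out = np.zeros((x.shape[0] - k + 1, x.shape[1] - k + 1, kernels.shape[3]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + k, j:j + k, :]
            out[i, j] = np.tensordot(patch, kernels, axes=3)
    return np.maximum(out, 0.0)


def mlp_softmax(v, w1, b1, w2, b2):
    """One hidden ReLU layer followed by a softmax output layer."""
    hidden = np.maximum(v @ w1 + b1, 0.0)
    logits = hidden @ w2 + b2
    e = np.exp(logits - logits.max())              # shift for numerical stability
    return e / e.sum()


rng = np.random.default_rng(0)
image = rng.random((32, 32))                       # stand-in for one grayscale character
n_classes = 46                                     # assumed class count

hog = hog_features(image)                                     # (8, 8, 9)
fmap = conv2d_relu(hog, rng.normal(0, 0.1, (3, 3, 9, 16)))    # (6, 6, 16)
flat = fmap.ravel()
probs = mlp_softmax(
    flat,
    rng.normal(0, 0.1, (flat.size, 64)), np.zeros(64),
    rng.normal(0, 0.1, (64, n_classes)), np.zeros(n_classes),
)
predicted_class = int(probs.argmax())
```

Because the weights are untrained, `predicted_class` carries no meaning here; the sketch only shows how the output of each stage feeds the next. Feeding HOG maps rather than raw pixels into the convolution stage is one plausible reading of how such a hybrid could keep its computational cost low.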
Data Availability
The Kaggle Devanagari offline handwritten character dataset that supports the findings of this study is openly available on Kaggle at https://www.kaggle.com/datasets/ashokpant/devanagari-character-dataset (reference [18]). The DHCD dataset can be obtained from the corresponding author of reference [19].
References
Chaudhuri BB, Pal U (1998) A complete printed Bangla OCR system. Pattern Recogn 31(5):531–549
Mukherjee J, Parui SK, Roy U (2020) NN-based analytic approach to symbol level recognition for degraded Bengali printed documents. Sādhanā 45(1):1–22
Mukherjee J, Parui SK, Roy U (2021) An unsupervised and robust line and word segmentation method for handwritten and degraded printed document. Trans Asian Low-Resour Lang Inf Process 21(2):1–31
Sharma AK, Thakkar P, Adhyaru DM, Zaveri TH (2019) Handwritten Gujarati character recognition using structural decomposition technique. Pattern Recogn Image Anal 29(2):325–338
Seethalakshmi R, Sreeranjani T, Balachandar T, Singh A, Singh M, Ratan R, Kumar S (2005) Optical character recognition for printed Tamil text using Unicode. J Zhejiang Univer-Sci A 6(11):1297–1305
Narang SR, Jindal MK, Ahuja S, Kumar M (2020) On the recognition of Devanagari ancient handwritten characters using SIFT and Gabor features. Soft Comput 24(22):17279–17289
Schantz HF (1982) The history of OCR, optical character recognition. Manchester Center, Vt., Recognition Technologies Users Association
Paul (1970) The History of OCR, vol 12, number 46. Data processing magazine
Fournier E (1920) The type-reading optophone, our surplus, our ships, and Europe’s need, and more, vol 123, number 19. Scientific American Publishing Co., pp 463–465
d’Albe EF (1914) On a type-reading optophone. Proc R Soc Lond. Ser A, Contain Pap Math and Phys Charact 90(619):373–375
Pruden JB (1912) Cut-off. Google Patents. US Patent 1,018,925
Macchina per leggere pei ciechi (PDF), in La scienza per tutti, Year XXVIII, n° 2, Milano, Casa Editrice Sonzogno, 15 January 1921, p 20 (in Italian)
Gustav T (1935) Card controlled machine. Google Patents. US Patent 1,997,157
Guha R, Das N, Kundu M, Nasipuri M, Santosh KC (2020) Devnet: an efficient CNN architecture for handwritten Devanagari character recognition. Int J Pattern Recogn Artif Intell 34(12):2052009
Bisht M, Gupta R (2021) Offline handwritten Devanagari modified character recognition using convolutional neural network. Sādhanā 46(1):1–4
Pande SM, Jha BK (2021) Character recognition system for Devanagari script using machine learning approach. In: 2021 5th international conference on computing methodologies and communication (ICCMC). IEEE, pp 899–903
Kaaniche MB, Brémond F (2009) Tracking HOG descriptors for gesture recognition. In: 2009 6th IEEE international conference on advanced video and signal based surveillance. IEEE, pp 140–145
Pant AK, Panday SP, Joshi SR (2012) Off-line Nepali handwritten character recognition using Multilayer Perceptron and Radial basis function neural networks. In: 2012 3rd Asian himalayas international conference on internet. IEEE, pp 1–5
Acharya S, Pant AK, Gyawali PK (2015) Deep learning based large scale handwritten Devanagari character recognition. In: 2015 9th international conference on software, knowledge, information management and applications (SKIMA). IEEE, pp 1–6
Aktaruzzaman M, Dagnew TM, Rivolta MW, Sassi R (2020) Improved low-cost recognition system for handwritten Bengali numerals. Int J Comput Appl Technol 62(4):375–383
Basu S, Das N, Sarkar R, Kundu M, Nasipuri M, Basu DK (2009) A hierarchical approach to recognition of handwritten Bangla characters. Pattern Recogn 42(7):1467–1484
Dongre VJ, Mankar VH (2015) Devanagari offline handwritten numeral and character recognition using multiple features and neural network classifier. In: 2015 2nd international conference on computing for sustainable global development (INDIACom). IEEE, pp 425–431
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mukherjee, J., Mishra, S., Tomar, A. et al. A low-cost hybrid handwritten Devanagari character classifier. Innovations Syst Softw Eng (2022). https://doi.org/10.1007/s11334-022-00518-7
DOI: https://doi.org/10.1007/s11334-022-00518-7