Abstract
Recognition of handwritten characters remains a challenging task. Thousands of languages are used across the globe for written communication. Optical character recognition (OCR) is a challenging domain in this context, where images of such documents must be recognized either online or offline. In online recognition, characters are recognized while the document is being written; in offline recognition, they are recognized from stored documents. Numerous applications in fields such as medical transcription, digitization of ancient manuscripts, and language translation depend on OCR. In this work, an efficient framework for handwritten character recognition is presented that can be used for both offline and online processes. The proposed framework takes handwritten character images as input and applies a set of pre-processing steps so that the samples become suitable for feature extraction. The novelty lies in the feature extraction process, in which three distinct types of features are considered based on the shape primitives of the images. These features are global to the sample. Local shape features are then extracted from these global shape features after a suitable quantization process. These local feature vectors, dubbed evidences, can be used generically to recognize test samples and are stored in a glossary dubbed the evidence glossary. The scheme is efficient because it can recognize characters using only a fraction of the feature vector. Other advantages of the proposed work are scale invariance and rotation invariance. Suitable datasets from two distinct languages, namely Odia and English, are used to evaluate the efficiency of the framework. The performance of the framework is compared with six distinct state-of-the-art machine learning models, and it outperforms the competing models on several performance metrics.
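The pipeline described above (global shape feature, quantization, local evidences, glossary lookup) can be illustrated with a minimal Python sketch. The abstract does not specify the three shape features, the quantization scheme, or the glossary's data structure, so everything here is an assumption made for illustration: the global feature is taken to be a one-dimensional shape signature (for example, contour distances sampled at equal angular steps), quantization is uniform after dividing by the maximum, evidences are cyclic sliding windows, and the names `quantize`, `local_evidences`, and `EvidenceGlossary` are hypothetical. It is not the authors' implementation.

```python
from collections import Counter, defaultdict


def quantize(signature, levels=4):
    """Map a global shape signature (list of non-negative floats) to integer symbols.

    Dividing by the maximum makes the symbols independent of scale
    (an assumed scheme, not taken from the paper).
    """
    peak = max(signature)
    if peak == 0:
        return [0] * len(signature)
    return [min(int(levels * v / peak), levels - 1) for v in signature]


def local_evidences(symbols, width=3):
    """Slide a window over the cyclic symbol sequence; each window is one evidence.

    Treating the sequence as cyclic means a rotation of the shape, which
    shifts the signature, yields the same set of windows.
    """
    n = len(symbols)
    return [tuple(symbols[(i + k) % n] for k in range(width)) for i in range(n)]


class EvidenceGlossary:
    """Hypothetical glossary: maps each evidence to the labels it was seen with."""

    def __init__(self, levels=4, width=3):
        self.levels = levels
        self.width = width
        self.table = defaultdict(Counter)  # evidence -> label -> count

    def add(self, signature, label):
        """Store every local evidence of a training sample under its label."""
        symbols = quantize(signature, self.levels)
        for evidence in local_evidences(symbols, self.width):
            self.table[evidence][label] += 1

    def recognize(self, signature, fraction=1.0):
        """Vote with the first `fraction` of the sample's evidences.

        Returns the label with the most votes, or None if no evidence matches.
        """
        symbols = quantize(signature, self.levels)
        evidences = local_evidences(symbols, self.width)
        keep = max(1, int(len(evidences) * fraction))
        votes = Counter()
        for evidence in evidences[:keep]:
            for label in self.table.get(evidence, {}):
                votes[label] += 1
        return votes.most_common(1)[0][0] if votes else None


# Toy signatures (invented for this example, not real character data).
glossary = EvidenceGlossary()
glossary.add([1.0, 1.4, 1.0, 1.4, 1.0, 1.4, 1.0, 1.4], "square")
glossary.add([1.0] * 8, "circle")

# A scaled and rotated (cyclically shifted) square, using a quarter of its evidences.
query = [2.8, 2.0, 2.8, 2.0, 2.8, 2.0, 2.8, 2.0]
result = glossary.recognize(query, fraction=0.25)
```

In this toy setup the query is a doubled and shifted copy of the "square" signature, and a quarter of its evidences is enough to match it, which mirrors the abstract's claims about using only a fraction of the feature vector and about scale and rotation invariance. Real character images would first need the pre-processing and shape-feature extraction steps that the abstract mentions but does not detail.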
Data availability
The data sets generated during this study may be provided upon request.
Funding
This study was supported by funding from Prince Sattam bin Abdulaziz University, project number PSAU/2023/R/1444.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Mishra, T.K., Kolhar, M., Mishra, S.R. et al. Local features-based evidence glossary for generic recognition of handwritten characters. Neural Comput & Applic 36, 685–695 (2024). https://doi.org/10.1007/s00521-023-09051-5
DOI: https://doi.org/10.1007/s00521-023-09051-5