Abstract
Designing a robust and efficient OCR system is an active and engaging area of image processing. An OCR system aims to convert text in images into machine-readable text. Successful recognition of degraded documents has numerous applications, such as the preservation of history and of long-lived documents. Many approaches exist for various scripts and languages, though so far mostly for good-quality documents. In contrast, only a limited number of approaches have been explored for degraded printed Marathi characters. An OCR system comprises stages such as data collection, preprocessing, feature extraction, segmentation, classification, and recognition. For this work, we developed our own dataset of 4900 images covering 49 isolated Marathi characters. Among these stages, preprocessing plays a significant role, especially for degraded characters. This paper focuses on the preprocessing of degraded Marathi characters. Mean square error, mutual information, and peak signal-to-noise ratio are used to evaluate the enhanced image. The results of the proposed approach were obtained in MATLAB R2015a. Many researchers have worked on OCR for degraded printed documents in various languages, whereas very few attempts have been made at degraded Marathi character recognition. For this problem, recognition accuracy depends largely on the preprocessing technique and on how well character information and features are retained. To overcome the information loss and poor feature retention of earlier reported systems, a novel and enhanced preprocessing technique is developed specifically for degraded Marathi character recognition. The proposed methodology gives promising results compared with other methods, with a reported value of 35.14. Similarly, the average mutual information of the proposed technique is 2.71, which is greater than that of the other methods.
The proposed technique also yields a lower average mean square error and response time. The proposed preprocessing method is expected to improve recognition accuracy for degraded isolated Marathi characters.
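The three evaluation measures named in the abstract can be illustrated with a short sketch. This is not the authors' MATLAB implementation. It assumes 8-bit grayscale images flattened into equal-length sequences of pixel values, a peak value of 255, and mutual information computed in bits from the joint histogram of raw intensities. The function names `mse`, `psnr`, and `mutual_information` are chosen here for illustration only.

```python
import math
from collections import Counter


def mse(ref, out):
    """Mean square error between two equal-length pixel sequences."""
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)


def psnr(ref, out, peak=255):
    """Peak signal-to-noise ratio in dB (assumes 8-bit pixels by default)."""
    err = mse(ref, out)
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / err)


def mutual_information(ref, out):
    """Mutual information in bits, estimated from the joint histogram."""
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and the same size")
    n = len(ref)
    joint = Counter(zip(ref, out))  # counts of (ref pixel, out pixel) pairs
    p_ref = Counter(ref)            # marginal counts for the reference image
    p_out = Counter(out)            # marginal counts for the processed image
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) / (p(a) * p(b)) simplifies to c * n / (count_a * count_b)
        mi += (c / n) * math.log2(c * n / (p_ref[a] * p_out[b]))
    return mi


# Hypothetical 2x2 images, flattened row by row.
reference = [0, 0, 255, 255]
enhanced = [0, 10, 255, 245]

print(mse(reference, enhanced))
print(psnr(reference, enhanced))
print(mutual_information(reference, enhanced))
```

Under these definitions, a lower mean square error and a higher peak signal-to-noise ratio or mutual information indicate that the processed image stays closer to the reference, which matches the direction of the comparisons reported in the abstract.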
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sonawane, M.S., Dhawale, C.A., Patil, C.H. (2023). Enhanced Preprocessing Technique for Degraded Printed Marathi Characters. In: Kulkarni, A.J., Mirjalili, S., Udgata, S.K. (eds) Intelligent Systems and Applications. Lecture Notes in Electrical Engineering, vol 959. Springer, Singapore. https://doi.org/10.1007/978-981-19-6581-4_25
DOI: https://doi.org/10.1007/978-981-19-6581-4_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6580-7
Online ISBN: 978-981-19-6581-4
eBook Packages: Computer Science, Computer Science (R0)