Abstract—
“MODI lipi” is one of the ancient scripts of India and remains an unrecognized script. It is no longer in use, but it is important to historical researchers studying not only ancient Maratha history but also the history of other regions of India. Recognition of MODI demands a transform-invariant approach, because MODI documents are severely deformed. Invariant handwritten character recognition has been achieved in the past with feature extraction methods, yet there is scope to improve the results under global transformations. At present, convolutional neural networks (CNNs) exhibit only local transform invariance, obtained through the convolution-pooling architecture and data augmentation. To achieve global invariance for MODI recognition, the proposed classification framework uses CNN-based transfer learning together with the histogram of oriented gradients (HOG), a global feature extractor. Additionally, criteria based on principal component analysis (PCA) and the confusion matrix are introduced to select the invariant features and to identify the classes responsible for a poor recognition rate. The proposed classifiers are trained on a self-created dataset of handwritten MODI characters and tested on a transformed MODI dataset. The results show that the proposed framework recognizes handwritten MODI characters effectively under transformations, without data augmentation or network alteration.
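The abstract does not spell out the confusion-matrix criterion, so the following is only a minimal sketch of one plausible reading: build a confusion matrix, compute per-class recall, and flag the classes that fall below a threshold. The class names, the toy labels, the 0.8 threshold, and the helper names (`confusion_matrix`, `per_class_recall`, `weakest_classes`) are all hypothetical and are not taken from the article.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] counts samples of classes[i] that were predicted as classes[j]."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix):
    """Recall of class i is the diagonal entry divided by the row sum."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


def weakest_classes(matrix, classes, threshold):
    """Return (class, recall) pairs below the threshold, worst first."""
    pairs = sorted(zip(classes, per_class_recall(matrix)), key=lambda x: x[1])
    return [(c, r) for c, r in pairs if r < threshold]


# Hypothetical toy data: three placeholder character classes, four samples each.
classes = ["ka", "kha", "ga"]
true_labels = ["ka"] * 4 + ["kha"] * 4 + ["ga"] * 4
predicted_labels = (
    ["ka", "ka", "ka", "kha"]      # one "ka" sample misread as "kha"
    + ["kha", "ka", "ga", "ga"]    # three "kha" samples misread
    + ["ga", "ga", "ga", "ga"]     # all "ga" samples correct
)

matrix = confusion_matrix(true_labels, predicted_labels, classes)
weak = weakest_classes(matrix, classes, threshold=0.8)  # threshold is an assumption
print(weak)
```

In this toy run the row for "kha" has only one correct prediction out of four, so it is listed first. The off-diagonal entries of that row show which classes it is confused with, which is the kind of information such a criterion would use to locate the source of a poor recognition rate.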
ACKNOWLEDGMENTS
The authors express their gratitude to Dr. Vishwanath Karad MITWPU, Pune, Maharashtra, India, for providing the NVIDIA GeForce GTX TITAN used in the experiments.
Ethics declarations
COMPLIANCE WITH ETHICAL STANDARDS
This article is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA Editorial Board decides not to accept it for publication.
Conflict of Interest
The authors declare that they have no conflicts of interest.
Additional information
Savitri Nathrao Jadhav received her postgraduate degree from the University of Pune, Maharashtra, India, in 2009. She is currently working as an Assistant Professor in the School of Electronics and Communication Engineering, Dr. Vishwanath Karad MITWPU, Pune, India. She has a total of 14 years of academic experience. She is a member of SAE-India.
Vandana S. Inamdar received her Ph.D. degree from Savitribai Phule Pune University, Pune, India. She has worked as an Associate Professor in the Department of Computer Engineering and Information Technology, College of Engineering, Pune, India. She is currently an Associate Professor in the Department of Computer Engineering, Govt. College of Engineering and Research, Awasari (Khurd), Pune, India. She has 28 years of academic experience and around forty publications to her credit. She is a member of the IEEE Signal Processing Society, CSI India, IETE, and ISTE.
Cite this article
Savitri Jadhav and Vandana Inamdar, “Convolutional neural network and histogram of oriented gradient based invariant handwritten MODI character recognition,” Pattern Recognit. Image Anal. 32, 402–418 (2022). https://doi.org/10.1134/S1054661822020109
DOI: https://doi.org/10.1134/S1054661822020109