Abstract
Classifying handwritten and printed text in pre-printed documents improves the performance of optical character recognition technologies. The objective of this work is to devise an approach for automatically classifying printed and handwritten text at the word level, since both inherently occur in pre-printed documents. The proposed method classifies printed and handwritten words in Telugu pre-printed documents in three stages. Stage one computes features from the segmented words, stage two determines the text discrimination coefficient, and stage three classifies the text as printed or handwritten using a decision model. Statistical and geometrical moment features are computed with respect to the text block under consideration, and these features are then used to determine the text discrimination coefficient. The experimental results are promising and robust, with an accuracy of around 98.2%.
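The three stages can be illustrated with a minimal sketch. The abstract does not give the formula for the text discrimination coefficient or describe the decision model, so this sketch substitutes a well-known moment quantity, Hu's first invariant (the sum of the normalized second-order central moments), for the coefficient, and a single threshold for the decision model. The function names, the threshold, and the direction of the decision rule are all hypothetical and are not taken from the paper.

```python
import numpy as np


def raw_moment(img, p, q):
    """Geometric (raw) moment m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    return float(np.sum((xs ** p) * (ys ** q) * img))


def central_moment(img, p, q):
    """Central moment mu_pq, taken about the intensity centroid of the image."""
    m00 = raw_moment(img, 0, 0)
    if m00 == 0:
        return 0.0  # empty word image: no ink, so no spread
    xc = raw_moment(img, 1, 0) / m00
    yc = raw_moment(img, 0, 1) / m00
    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    return float(np.sum(((xs - xc) ** p) * ((ys - yc) ** q) * img))


def stand_in_coefficient(img):
    """Stage 2 stand-in (assumption): Hu's first invariant, eta20 + eta02.

    This is NOT the paper's text discrimination coefficient, whose
    definition is not given in the abstract.
    """
    img = np.asarray(img, dtype=float)
    m00 = raw_moment(img, 0, 0)
    if m00 == 0:
        return 0.0
    # For second-order moments, eta_pq = mu_pq / m00**2.
    return (central_moment(img, 2, 0) + central_moment(img, 0, 2)) / m00 ** 2


def classify_word(img, threshold):
    """Stage 3 stand-in (assumption): a one-threshold decision rule.

    Both the threshold value and which side maps to which label are
    hypothetical choices made only for illustration.
    """
    return "handwritten" if stand_in_coefficient(img) > threshold else "printed"


# Stage 1 input: a binarized, already segmented word image (1 = ink).
word = np.ones((3, 3))
coefficient = stand_in_coefficient(word)
```

In practice the threshold would have to be chosen from labelled samples, and the paper's own coefficient would replace the stand-in.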
Copyright information
© 2017 Springer Science+Business Media Singapore
About this paper
Cite this paper
Rani, N.S., Vasudev, T. (2017). An Unsupervised Classification of Printed and Handwritten Telugu Words in Pre-printed Documents Using Text Discrimination Coefficient. In: Satapathy, S., Prasad, V., Rani, B., Udgata, S., Raju, K. (eds) Proceedings of the First International Conference on Computational Intelligence and Informatics. Advances in Intelligent Systems and Computing, vol 507. Springer, Singapore. https://doi.org/10.1007/978-981-10-2471-9_67
DOI: https://doi.org/10.1007/978-981-10-2471-9_67
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2470-2
Online ISBN: 978-981-10-2471-9
eBook Packages: Engineering, Engineering (R0)