Odia Running Text Recognition Using Moment-Based Feature Extraction and Mean Distance Classification Technique

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 309)

Abstract

Optical character recognition (OCR) is the automatic recognition of characters in optically scanned documents for the purposes of editing, indexing, and searching, as well as reducing storage space. Developing OCR for Indian scripts is an active area of research because the large number of letters in the alphabet, their sophisticated combinations, and the complicated graphemes they form pose a great challenge to an OCR designer. We are developing an OCR system for the Odia language, the official language of Odisha (formerly known as Orissa). In this paper, we attempt to recognize the vowels, consonants, matras, and compound characters of running Odia script. First, the scanned text is segmented into individual Odia symbols. Feature vectors are then extracted from each symbol using two-dimensional moments and the Hough transform (based on topological and geometrical properties), and these vectors are used to classify and recognize the symbol. We found that the proposed model can recognize up to 100 % of running text that contains no touching characters.
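The feature extraction and classification steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which moments are used or how distances are measured, so the sketch assumes normalized central moments of orders 2 and 3 as the feature vector and Euclidean distance to per-class mean vectors as the classifier. The Hough transform features and the segmentation step are omitted, and the function names and toy class labels are hypothetical.

```python
import math


def central_moment_features(glyph, max_order=3):
    """Normalized central moments eta_pq for 2 <= p + q <= max_order.

    `glyph` is a list of rows of 0/1 pixel values (1 = ink). Choosing these
    moments as the feature vector is an assumption of this sketch.
    """
    pixels = [(x, y) for y, row in enumerate(glyph) for x, v in enumerate(row) if v]
    m00 = len(pixels)  # zeroth moment = number of ink pixels
    if m00 == 0:
        raise ValueError("glyph has no foreground pixels")
    cx = sum(x for x, _ in pixels) / m00  # centroid
    cy = sum(y for _, y in pixels) / m00
    features = []
    for order in range(2, max_order + 1):
        for p in range(order, -1, -1):
            q = order - p
            mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
            # Dividing by m00^(1 + (p+q)/2) makes the moment scale-invariant.
            features.append(mu / m00 ** (1 + order / 2))
    return features


def class_means(training):
    """Map each label to the mean of its training feature vectors."""
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in training.items()
    }


def classify(features, means):
    """Return the label whose mean vector is closest (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(features, means[label]))


# Toy glyphs standing in for segmented symbols (hypothetical classes).
vertical = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
horizontal = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
means = class_means({
    "vertical": [central_moment_features(vertical)],
    "horizontal": [central_moment_features(horizontal)],
})

tall_bar = [[0, 1, 0]] * 5
predicted = classify(central_moment_features(tall_bar), means)
```

In this toy setup a taller vertical bar is assigned to the "vertical" class because its second-order moment along the y axis dominates, just as in the class mean. A real system would average many training samples per symbol.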

Keywords

Optical character recognition; Odia language; Matras; Juktakhyara; Image processing; Feature extraction; Recognition

Copyright information

© Springer India 2015

Authors and Affiliations

  1. Siksha ‘O’ Anusandhan University, Bhubaneswar, India
