Design of a Bilingual Kannada–English OCR

  • R.S. Umesh
  • Peeta Basa Pati
  • A.G. Ramakrishnan
Chapter
Part of the Advances in Pattern Recognition book series (ACVPR)

Abstract

India is a land of many languages, and consequently one often encounters documents that contain elements in multiple languages and scripts. This chapter presents an approach to designing a bilingual OCR that can process documents containing both English and Kannada text; the Kannada script is used to write Kannada, the language of the southern Indian state of Karnataka. We report an efficient script identification scheme for discriminating Kannada from Roman script. We also propose a novel segmentation and recognition scheme for Kannada, which could possibly be applied to many other Indian languages as well.

Keywords

Support Vector Machine; Discrete Cosine Transform; Test Pattern; Text Line; Discrete Cosine Transform Coefficient
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2009

Authors and Affiliations

  • R.S. Umesh
    • 1
  • Peeta Basa Pati
    • 1
  • A.G. Ramakrishnan
    • 1
  1. Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
