Language Independent Skew Estimation Technique Based on Gaussian Mixture Models: A Case Study on South Indian Scripts

  • V. N. Manjunath Aradhya
  • Ashok Rao
  • G. Hemantha Kumar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4815)

Abstract

Skew is inevitably introduced into a document image during scanning. In documents written in South Indian languages (Kannada, Telugu, Tamil and Malayalam), estimating the inclination is particularly challenging because additional modified characters are attached as extensions and remain as disjointed protrusions of a main character. In this paper, we present a novel script-independent (for South Indian scripts) skew estimation technique based on Gaussian Mixture Models (GMM). The Expectation-Maximization (EM) algorithm is used to learn the mixture of Gaussians. Subsequently, moments are computed on the cluster means to estimate the skew angle. Experiments are conducted on printed and handwritten documents corrupted by noise. Our method shows significantly improved performance compared with other existing methods.
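The pipeline outlined in the abstract (fit a mixture of Gaussians with EM, then derive an angle from moments of the cluster means) can be illustrated with a minimal sketch. The abstract does not state the covariance model, the number of components, or which moments are used, so everything below is an assumption made for illustration: isotropic 2-D Gaussians, a caller-chosen `k`, and the standard second-order central-moment orientation formula. The function names `fit_gmm_means` and `skew_angle_degrees` are hypothetical and are not taken from the paper.

```python
import math
import random


def fit_gmm_means(points, k, iters=60, seed=0):
    """EM for a mixture of k isotropic 2-D Gaussians (an assumed model).

    points: list of (x, y) foreground pixel coordinates.
    Returns the k component means after `iters` EM iterations.
    """
    rng = random.Random(seed)
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    # Start every component with the overall per-axis variance of the data.
    var0 = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / (2 * n)
    means = rng.sample(points, k)
    variances = [max(var0, 1e-6)] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x, y in points:
            logs = [
                math.log(weights[j])
                - math.log(2 * math.pi * variances[j])
                - ((x - means[j][0]) ** 2 + (y - means[j][1]) ** 2)
                / (2 * variances[j])
                for j in range(k)
            ]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])

        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an (almost) empty component unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(
                r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                for r, p in zip(resp, points)
            ) / (2 * nj)
            means[j] = (mx, my)
            variances[j] = max(var, 1e-6)
            weights[j] = nj / n
    return means


def skew_angle_degrees(pts):
    """Orientation of a point set from its second-order central moments."""
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    mu20 = sum((x - cx) ** 2 for x, y in pts)
    mu02 = sum((y - cy) ** 2 for x, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))
```

In this sketch, `skew_angle_degrees(fit_gmm_means(pixels, k))` would give an angle estimate for one text line, where `pixels` is a list of foreground coordinates. The sign of the angle depends on whether the y axis points up or down in the chosen image coordinate system.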

Keywords

Gaussian Mixture Model; Document Image; Text Line; Pattern Recognition Letter; English Document
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • V. N. Manjunath Aradhya (1)
  • Ashok Rao (2)
  • G. Hemantha Kumar (1)

  1. Dept of Studies in Computer Science, University of Mysore, Mysore - 570 006, India
  2. Dept of Electronics and Communication, S.J. College of Engineering, Mysore, India
