
Digital Image Enhancement of Indic Historical Manuscripts

  • Zhixin Shi
  • Srirangaraj Setlur
  • Venu Govindaraju
Chapter
Part of the Advances in Pattern Recognition book series (ACVPR)

Abstract

Historical documents in Indic scripts are found on a wide range of media, including paper, palm leaves, and parchment. Palm leaves are believed to be one of the earliest writing media, and their use as writing material has been recorded in various parts of the world, including India. Ancient palm leaf manuscripts on religion, science, medicine, and astronomy remain available for reference today thanks to ongoing preservation efforts by libraries and universities around the world. These manuscripts typically last a few centuries, but over time the leaves degrade and the writing becomes illegible. Image processing techniques can enhance images of these manuscripts and restore the readability of the written text. In this chapter, we propose methods for enhancing digital images of palm leaf and other historical manuscripts. We approximate the background of a gray-scale image using piece-wise linear and nonlinear models. Normalization algorithms are applied to the color channels of the palm leaf image to obtain an enhanced gray-scale image. Experimental results show significant improvement in readability. An adaptive local connectivity map is then used to segment lines of text from the enhanced images, with the objective of facilitating techniques such as keyword spotting or partial OCR and thereby making it possible to index these documents for retrieval from a digital library.
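
The abstract does not spell out the background model, so the following is only a minimal sketch of the general idea of piece-wise linear background approximation followed by intensity normalization. It works on one image row at a time, splits the row into fixed-length pieces, fits a least-squares line to the brighter pixels of each piece, and rescales every pixel by the fitted background. The function names, the `piece_len` and `bg_quantile` parameters, and the rule that pixels at or above the piece median count as background are all assumptions made for illustration, not details taken from the chapter.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # A single sample (or identical x values): fall back to a constant.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def estimate_background_row(row, piece_len=8, bg_quantile=0.5):
    """Piece-wise linear background estimate for one row of gray values."""
    background = []
    for start in range(0, len(row), piece_len):
        piece = row[start:start + piece_len]
        # Assumption: pixels at or above this quantile are background
        # (bright leaf or paper), darker pixels are ink.
        cutoff = sorted(piece)[int(len(piece) * bg_quantile)]
        xs = [i for i, v in enumerate(piece) if v >= cutoff]
        ys = [piece[i] for i in xs]
        a, b = fit_line(xs, ys)
        background.extend(a * i + b for i in range(len(piece)))
    return background


def normalize_row(row, background, target=255.0):
    """Rescale each pixel so the estimated background maps to `target`."""
    out = []
    for value, bg in zip(row, background):
        scaled = target * value / bg if bg > 0 else target
        out.append(int(round(min(target, max(0.0, scaled)))))
    return out


def enhance(image, piece_len=8):
    """Apply row-wise background normalization to a list-of-lists image."""
    return [
        normalize_row(row, estimate_background_row(row, piece_len))
        for row in image
    ]


# Hypothetical input: a brightness gradient with one dark "ink" pixel.
row = [100, 110, 120, 20, 140, 150, 160, 170]
enhanced = enhance([row])
```

In this toy input the gradient is flattened to a uniform bright background while the dark pixel stays dark, which is the effect background normalization aims for. A practical version would model the background in two dimensions and handle the color channels separately, as the abstract indicates.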

Keywords

Image enhancement, Image processing, Document pre-processing, Historical documents, Palm leaf manuscripts, Indic scripts, OCR, Text line extraction


Copyright information

© Springer-Verlag London Limited 2009

Authors and Affiliations

  • Zhixin Shi (1)
  • Srirangaraj Setlur (2)
  • Venu Govindaraju (2)

  1. Department of Computer Science and Engineering, University at Buffalo, Buffalo, USA
  2. Department of Computer Science and Engineering, Center for Unified Biometrics and Sensors, University at Buffalo, Amherst, USA
