Historical Handwritten Document Image Segmentation Using Morphology

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 298)

Abstract

Automatic recovery of text from historical documents is difficult because the documents are degraded by different types of noise. Applying a global threshold, or a threshold chosen by visual intuition, misses the finer handwritten text with low intensity values: such text is treated as part of the background and is neglected. A single threshold cannot segment the whole image clearly, because degradation leaves the text with various intensity levels. To restore the missing text, we propose a thresholding algorithm based on mathematical morphology that generates a very fine adaptive threshold. After the global threshold is applied, the remaining background image contains a mixture of background and handwritten text intensities; we apply mathematical morphology (opening and closing) to it, which produces a smooth contour and yields an adaptive threshold. The resulting thresholded image has a clear, uniform background and foreground with enhanced character appearance.
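
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the structuring element, the order of the morphological operations, or how the threshold is derived from their output. The sketch assumes an 8-bit grayscale page with dark ink on bright paper, a flat square structuring element, a background estimate formed by closing followed by opening, and a hypothetical `offset` below that estimate as the adaptive threshold. For brevity the morphology runs on the whole image rather than only on the residual left by the global threshold; the closing removes thin strokes in either case. All function and parameter names are illustrative.

```python
def _window_filter(img, size, pick):
    """Flat size x size min/max filter; windows are clipped at the borders."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = pick(window)
    return out


def erode(img, size):
    return _window_filter(img, size, min)


def dilate(img, size):
    return _window_filter(img, size, max)


def opening(img, size):
    # Erosion followed by dilation: removes small bright details.
    return dilate(erode(img, size), size)


def closing(img, size):
    # Dilation followed by erosion: removes small dark details (thin strokes).
    return erode(dilate(img, size), size)


def threshold_surface(img, size=3, offset=20):
    """Per-pixel threshold (assumption): smoothed background minus an offset."""
    background = opening(closing(img, size), size)
    return [[b - offset for b in row] for row in background]


def segment(img, global_t, size=3, offset=20):
    """Return a mask (1 = text): global threshold OR adaptive threshold."""
    surface = threshold_surface(img, size, offset)
    return [[1 if (p < global_t or p < t) else 0
             for p, t in zip(row, t_row)]
            for row, t_row in zip(img, surface)]


# Toy page: background 200, a strong stroke (30) and a faint stroke (150).
page = [
    [200, 200, 200, 200, 200],
    [200,  30, 200, 150, 200],
    [200,  30, 200, 150, 200],
    [200,  30, 200, 150, 200],
    [200, 200, 200, 200, 200],
]

mask = segment(page, global_t=100)
for row in mask:
    print(row)
```

On the toy page, a global threshold of 100 alone would keep only the strong stroke. The faint stroke is recovered because it lies below the locally estimated background by more than the assumed offset.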

Keywords

Historical text segmentation, Adaptive thresholding, Mathematical morphology, Opening and closing

Copyright information

© Springer India 2014

Authors and Affiliations

  1. BIT, Mesra (Kolkata Campus), Ranchi, India
