An Adaptive Binarization Technique for Low Quality Historical Documents

  • Basilios Gatos
  • Ioannis Pratikakis
  • Stavros J. Perantonis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3163)

Abstract

Historical document collections are a valuable resource for human history. This paper proposes a novel digital image binarization scheme for low quality historical documents that allows their content to be exploited efficiently. The proposed scheme consists of five distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions using Niblack’s approach, a background surface calculation by interpolating neighboring background intensities, a thresholding step that combines the calculated background surface with the original image, and finally a post-processing step that improves the quality of text regions and preserves stroke connectivity. The proposed methodology remains effective even for historical manuscripts with poor quality, shadows, nonuniform illumination, low contrast, large signal-dependent noise, smear and strain. In tests on numerous low quality historical manuscripts, the methodology performed better than current state-of-the-art adaptive thresholding techniques.
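The five steps listed in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the abstract gives no parameters or formulas, so the window sizes, the Niblack weight `k`, the constant threshold fraction `q`, and the neighbour-count rules in the post-processing step are all assumptions chosen for readability. Dark text on a lighter background is assumed, and the function names (`box_mean`, `binarize`, and so on) are hypothetical.

```python
import numpy as np

def box_mean(img, size):
    """Mean over a size x size window (size odd, borders replicated)."""
    h, w = img.shape
    p = np.pad(np.asarray(img, dtype=float), size // 2, mode="edge")
    rows = sum(p[i:i + h, :] for i in range(size))       # vertical pass
    return sum(rows[:, j:j + w] for j in range(size)) / (size * size)

def local_stats(img, size):
    """Local mean and (non-negative) local variance."""
    mu = box_mean(img, size)
    var = np.maximum(box_mean(img * img, size) - mu * mu, 0.0)
    return mu, var

def wiener(img, size=3):
    """Step 1: adaptive low-pass Wiener filter (noise = mean local variance)."""
    mu, var = local_stats(img, size)
    noise = var.mean()
    gain = np.maximum(var - noise, 0.0) / np.maximum(var, max(noise, 1e-12))
    return mu + gain * (img - mu)

def niblack(img, size=15, k=-0.2):
    """Step 2: rough foreground mask, pixel < local mean + k * local std."""
    mu, var = local_stats(img, size)
    return img < mu + k * np.sqrt(var)

def background(img, rough, size=15):
    """Step 3: fill rough-foreground pixels with the mean of nearby background."""
    bg = ~rough
    num = box_mean(img * bg, size)
    den = box_mean(bg, size)
    usable = den > 0.5 / (size * size)      # at least one background pixel
    filled = num / np.maximum(den, 1e-12)
    return np.where(rough & usable, filled, img)

def binarize(img, q=0.6):
    img = np.asarray(img, dtype=float)
    smooth = wiener(img)
    rough = niblack(smooth)
    if not rough.any():
        return np.zeros(img.shape, dtype=bool)
    surface = background(smooth, rough)
    # Step 4 (assumed rule): text where the background surface exceeds the
    # original image by more than a fraction q of the mean distance measured
    # on the rough foreground.
    dist = surface - img
    text = dist > q * dist[rough].mean()
    # Step 5 (assumed rule): drop nearly isolated text pixels, then fill
    # background pixels that are mostly surrounded by text.
    count = box_mean(text, 3) * 9
    text = text & (count > 2.5)
    count = box_mean(text, 3) * 9
    return text | (count > 5.5)
```

Calling `binarize(gray)` on a 2-D grayscale array returns a boolean mask that is `True` on estimated text pixels. A constant `q` is the simplest possible choice for step 4; a threshold that varies with the local background level would be a natural refinement for pages with strong shadows.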

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Basilios Gatos (1)
  • Ioannis Pratikakis (1)
  • Stavros J. Perantonis (1)
  1. Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Research Center “Demokritos”, Athens, Greece
