Restoring Ink Bleed-Through Degraded Document Images Using a Recursive Unsupervised Classification Technique

  • Drira Fadoua
  • Frank Le Bourgeois
  • Hubert Emptoz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)

Abstract

This paper presents a new method for restoring a particular type of degradation found in images of ancient documents. This degradation, referred to as “bleed-through”, is caused by the porosity of the paper, the chemical quality of the ink, or the conditions of digitization. It appears as marks that degrade the readability of the document image. Our aim is therefore to remove these marks and improve readability. The proposed method is based on a recursive unsupervised segmentation approach applied to the data space decorrelated by principal component analysis. It generates a binary tree in which only the leaf images that satisfy a certain condition on their logarithmic histogram are processed. Experiments on real ancient document images provided by the archives of “Chatillon-Chalaronne” illustrate the effectiveness of the suggested method.
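The abstract gives only the outline of the approach, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that each node of the binary tree decorrelates its pixels with principal component analysis, splits them into two classes with a plain 2-means on the first component, and recurses. The leaf test `needs_split` (at least two sufficiently tall peaks in the logarithmic histogram) is a hypothetical stand-in, since the abstract does not state the actual condition. All function names, thresholds, and the depth and size limits are assumptions made for illustration.

```python
import numpy as np

def pca_decorrelate(pixels):
    """Project N x 3 colour samples onto their principal axes."""
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # strongest component first
    return centered @ eigvecs[:, order]

def two_means(values, iters=20):
    """2-class k-means on a 1-D array; returns a mask of the upper class."""
    lo, hi = values.min(), values.max()
    mask = values > (lo + hi) / 2
    for _ in range(iters):
        if mask.all() or not mask.any():
            break                            # degenerate split, give up
        lo, hi = values[~mask].mean(), values[mask].mean()
        new_mask = values > (lo + hi) / 2
        if np.array_equal(new_mask, mask):
            break                            # assignments are stable
        mask = new_mask
    return mask

def needs_split(values, bins=32, rel_height=0.5):
    """Hypothetical leaf test (an assumption, not the paper's condition):
    split only if the log-histogram shows at least two tall peaks."""
    hist, _ = np.histogram(values, bins=bins)
    log_hist = np.log1p(hist)
    padded = np.concatenate(([0.0], log_hist, [0.0]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])
    is_tall = log_hist >= rel_height * log_hist.max()
    return int((is_peak & is_tall).sum()) >= 2

def recursive_segment(pixels, index=None, depth=0, max_depth=3, min_size=50):
    """Return the leaves of the binary tree as arrays of pixel indices."""
    if index is None:
        index = np.arange(len(pixels))
    if depth >= max_depth or len(index) < min_size:
        return [index]
    pc1 = pca_decorrelate(pixels[index])[:, 0]
    if not needs_split(pc1):
        return [index]
    mask = two_means(pc1)
    if mask.all() or not mask.any():
        return [index]
    return (recursive_segment(pixels, index[~mask], depth + 1, max_depth, min_size)
            + recursive_segment(pixels, index[mask], depth + 1, max_depth, min_size))
```

For an H x W x 3 image array `img`, a call such as `recursive_segment(img.reshape(-1, 3).astype(float))` would return index arrays that partition the pixels. Deciding which leaves correspond to bleed-through and how to replace them is left out, because the abstract does not describe that step.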

Keywords

  • Document Image
  • Thresholding Technique
  • Restoration Method
  • Recursive Approach
  • Handwritten Document


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Drira Fadoua (1)
  • Frank Le Bourgeois (1)
  • Hubert Emptoz (1)
  1. LIRIS, INSA de Lyon, Villeurbanne, France
