Fast correction of bleed-through distortion in grayscale documents by a blind source separation technique

  • Anna Tonazzini
  • Emanuele Salerno
  • Luigi Bedini
Original Paper

Abstract

Ancient documents are usually degraded by strong background artifacts. These are often caused by the so-called bleed-through effect, a pattern that interferes with the main text because ink has seeped through from the reverse side. A similar effect, called show-through and due to the imperfect opacity of the paper, may appear in scans of even modern, well-preserved documents. These degradations must be removed to improve human or automatic readability. For this purpose, when a color scan of the document is available, we have shown that a simplified linear model of the overlapping patterns allows us to use very fast blind source separation techniques. This approach, however, cannot be applied to grayscale scans. This is a serious limitation, since many collections in our libraries and archives are now available only as grayscale scans or microfilms. We propose here a new model for bleed-through in grayscale document images, based on the availability of both the recto and verso pages, and show that blind source separation can be successfully applied in this case too. Some experiments with real ancient documents are presented and described.
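
As a rough illustration of the recto/verso idea, and not the authors' actual model (which this abstract does not spell out), the sketch below assumes that each scanned side is a linear, pixelwise mixture of two unknown patterns, and that the verso scan has already been mirrored and registered to the recto. Under those assumptions, a simple second-order separation (symmetric whitening of the two observed signals) can stand in for a full independent component analysis. The function names `whitening_matrix` and `unmix` are hypothetical, and each side is represented as a flat list of ink-density values.

```python
import math

def whitening_matrix(recto, verso):
    """Return the 2x2 symmetric whitening matrix C^(-1/2) of two pixel lists.

    Assumes the two sides are not perfectly correlated (det C > 0).
    """
    n = len(recto)
    mr = sum(recto) / n
    mv = sum(verso) / n
    # Sample covariance C = [[a, b], [b, d]] of the two observed sides.
    a = sum((r - mr) ** 2 for r in recto) / n
    d = sum((v - mv) ** 2 for v in verso) / n
    b = sum((r - mr) * (v - mv) for r, v in zip(recto, verso)) / n
    # Closed-form square root of a 2x2 symmetric positive-definite matrix:
    # sqrt(C) = (C + s*I) / t, with s = sqrt(det C), t = sqrt(trace C + 2s).
    s = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2.0 * s)
    ra, rb, rd = (a + s) / t, b / t, (d + s) / t
    # Invert sqrt(C) to obtain the whitening matrix.
    det = ra * rd - rb * rb
    return (rd / det, -rb / det), (-rb / det, ra / det)

def unmix(recto, verso):
    """Estimate the two underlying patterns from registered recto/verso scans.

    The outputs are zero-mean and unit-variance, so they need rescaling
    before being displayed as images.
    """
    n = len(recto)
    mr = sum(recto) / n
    mv = sum(verso) / n
    (w00, w01), (w10, w11) = whitening_matrix(recto, verso)
    est_recto = [w00 * (r - mr) + w01 * (v - mv) for r, v in zip(recto, verso)]
    est_verso = [w10 * (r - mr) + w11 * (v - mv) for r, v in zip(recto, verso)]
    return est_recto, est_verso
```

Symmetric whitening only guarantees that the two outputs are uncorrelated. It recovers the original patterns well when the mixing is roughly symmetric and the two patterns have similar contrast, which is an assumption of this sketch rather than a claim made in the abstract.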

Keywords

Grayscale document restoration Bleed-through cancellation Blind source separation Independent component analysis 

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Anna Tonazzini (1)
  • Emanuele Salerno (1)
  • Luigi Bedini (1)
  1. Istituto di Scienza e Tecnologie dell'Informazione - CNR, Via G. Moruzzi, Pisa, Italy
