Border Noise Removal and Clean Up Based on Retinex Theory

  • Marian Wagdy
  • Ibrahima Faye
  • Dayang Rohaya
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 285)

Abstract

Converting a grayscale or color document image into a binary image is the main step in most Optical Character Recognition (OCR) and document analysis systems. After digitization, document images often suffer from poor contrast, noise, non-uniform lighting, and shadows. In addition, when a page of a book is digitized with a scanner or a camera, border noise, which is unwanted text from the adjacent page, may appear. In this paper we present a simple and efficient document image clean-up method that removes border noise and enhances the image using retinex theory and a global threshold. The proposed method produces high-quality results compared with previous work.
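The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea it names: a retinex-style illumination correction followed by a single global threshold. It is not the authors' implementation. The box-filter illumination estimate, the choice of Otsu's method as the global threshold, and all function names (`box_blur`, `single_scale_retinex`, `otsu_threshold`, `binarize`) are assumptions made for illustration, and border noise removal is not covered.

```python
import math


def box_blur(img, radius):
    """Local mean over a (2r+1) x (2r+1) window, clipped at the image edges.

    Assumption: used here as a crude estimate of the slowly varying
    illumination; retinex implementations often use a Gaussian instead.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def single_scale_retinex(img, radius=2):
    """Reflectance estimate: log(image) - log(estimated illumination)."""
    illum = box_blur(img, radius)
    return [[math.log1p(p) - math.log1p(l) for p, l in zip(row, lrow)]
            for row, lrow in zip(img, illum)]


def otsu_threshold(values, bins=256):
    """Global threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    scale = (bins - 1) / (hi - lo)
    hist = [0] * bins
    for v in values:
        hist[int((v - lo) * scale)] += 1
    total = len(values)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for i, c in enumerate(hist):
        w0 += c           # pixels in bins 0..i (dark class)
        sum0 += i * c
        w1 = total - w0   # pixels in the remaining bins (bright class)
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_bin, best_var = i, var
    # Upper edge of the last bin assigned to the dark class.
    return lo + (best_bin + 1) / scale


def binarize(img, radius=2):
    """Return a mask with 1 for dark (ink) pixels and 0 for background."""
    refl = single_scale_retinex(img, radius)
    flat = [v for row in refl for v in row]
    t = otsu_threshold(flat)
    return [[1 if v < t else 0 for v in row] for row in refl]
```

The log ratio is what makes a single global threshold workable in this sketch: a shadow scales ink and paper by roughly the same factor, so dividing by the local illumination estimate largely cancels it before thresholding. `binarize` expects a grayscale image as a list of rows of non-negative numbers.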

Keywords

Binarization, Thresholding, Border noise, Retinex theory

Copyright information

© Springer Science+Business Media Singapore 2014

Authors and Affiliations

  1. Centre of Intelligent Signal and Imaging Research (CISIR), Universiti Teknologi Petronas, Seri Iskandar, Malaysia
  2. Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, Malaysia
  3. Department of Fundamental and Applied Sciences, Universiti Teknologi Petronas, Seri Iskandar, Malaysia
