A Comparative Study of Different Approaches of Noise Removal for Document Images

  • Brijmohan Singh
  • Mridula
  • Vivek Chand
  • Ankush Mittal
  • D. Ghosh
Conference paper
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 130)

Abstract

Optical Character Recognition (OCR) has been the subject of intensive research and a large body of published work. Noise is an undesirable signal that obscures the subject of an image, so it must be handled during preprocessing, before the later stages of an OCR system are applied. This paper presents a comparative study of five noise removal approaches for document images: Wiener, Median, Wavelet, Contourlet, and Curvelet. The approaches were compared visually and with three evaluation measures: Peak Signal-to-Noise Ratio (PSNR), F-measure, and NRM.
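
The three evaluation measures named in the abstract can be sketched in a few lines of Python. This is only an illustration and not the authors' implementation: it assumes that NRM stands for the Negative Rate Metric used in document binarization evaluation, that PSNR is computed on 8-bit grayscale images, and that F-measure and NRM are computed on binary images in which 1 marks text pixels. All function names are hypothetical.

```python
import math


def psnr(reference, test, max_value=255):
    """PSNR in dB between two equal-sized grayscale images (lists of rows)."""
    squared_errors = [
        (r - t) ** 2
        for row_r, row_t in zip(reference, test)
        for r, t in zip(row_r, row_t)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def confusion_counts(truth, predicted):
    """Return (TP, FP, FN, TN), assuming 1 = text pixel and 0 = background."""
    tp = fp = fn = tn = 0
    for row_t, row_p in zip(truth, predicted):
        for t, p in zip(row_t, row_p):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f_measure(truth, predicted):
    """Harmonic mean of precision and recall on text pixels."""
    tp, fp, fn, _ = confusion_counts(truth, predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def nrm(truth, predicted):
    """Assumed definition: mean of false-negative and false-positive rates."""
    tp, fp, fn, tn = confusion_counts(truth, predicted)
    nr_fn = fn / (fn + tp) if (fn + tp) else 0.0
    nr_fp = fp / (fp + tn) if (fp + tn) else 0.0
    return (nr_fn + nr_fp) / 2


# Toy data: a clean reference and a denoised result that differ in a few pixels.
gray_reference = [[0, 255], [255, 0]]
gray_denoised = [[0, 255], [255, 10]]
binary_truth = [[1, 1, 0, 0], [1, 1, 0, 0]]
binary_result = [[1, 0, 0, 0], [1, 1, 1, 0]]

print(round(psnr(gray_reference, gray_denoised), 2))
print(f_measure(binary_truth, binary_result))
print(nrm(binary_truth, binary_result))
```

Under these definitions, a higher PSNR and F-measure and a lower NRM indicate a result closer to the clean reference, which is how such measures would typically rank competing denoising filters.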

Keywords

OCR, Noise, Curvelet, Wavelet, Contourlet, Wiener filter, Median filter

Copyright information

© Springer India Pvt. Ltd. 2012

Authors and Affiliations

  • Brijmohan Singh (1)
  • Mridula (3)
  • Vivek Chand (1)
  • Ankush Mittal (2)
  • D. Ghosh (1)

  1. Research Cell, College of Engineering Roorkee, Roorkee, India
  2. Graphic Era University, Dehradun, India
  3. IIT Roorkee, Roorkee, India
