Information Density Based Image Binarization for Text Document Containing Graphics
In this work, a new clustering-based binarization technique is proposed. Clustering is performed according to the information density of the input image. The input image is treated as a combination of foreground (text and images) and background (random noise, ink marks, oil spots, etc.). Separating the foreground from the background is often quite difficult with existing binarization techniques, which give good results when the input image contains only text. Experimental results indicate that the proposed method performs particularly well on degraded text documents that also contain graphic images. The USC-SIPI database is used for testing. The method is compared with iterative partitioning and Otsu’s method on seven different metrics.
Keywords: Iterative partitioning, NTSC color format, Wiener filter, Binarization, Entropy
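For orientation, the baseline named in the abstract, Otsu’s method, can be sketched in a few lines. This is a minimal illustration of the baseline only, not the proposed information-density clustering, whose details are not given here. The function names (`histogram`, `shannon_entropy`, `otsu_threshold`, `binarize`) are hypothetical, and treating the Shannon entropy of the gray-level histogram as a rough measure of information density is an assumption, suggested only by the "Entropy" keyword.

```python
import math
from typing import List, Sequence


def histogram(pixels: Sequence[int], levels: int = 256) -> List[int]:
    """Count how many pixels fall on each gray level in [0, levels)."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def shannon_entropy(counts: Sequence[int]) -> float:
    """Shannon entropy in bits of a gray-level histogram.

    Assumption: used here only as one plausible proxy for information density.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def otsu_threshold(counts: Sequence[int]) -> int:
    """Return the level t that maximizes between-class variance.

    Pixels with value <= t form one class, the remaining pixels the other.
    """
    total = sum(counts)
    sum_all = sum(level * c for level, c in enumerate(counts))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray values in the lower class
    for t, c in enumerate(counts):
        w0 += c
        sum0 += t * c
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels: Sequence[int], t: int) -> List[int]:
    """Map pixels <= t to 0 (dark) and all others to 255 (bright)."""
    return [0 if p <= t else 255 for p in pixels]


if __name__ == "__main__":
    # Toy "image": 50 dark pixels and 50 bright pixels, flattened to a list.
    pixels = [10] * 50 + [200] * 50
    counts = histogram(pixels)
    t = otsu_threshold(counts)
    print("threshold:", t)
    print("entropy (bits):", shannon_entropy(counts))
    print("dark pixels after binarization:", binarize(pixels, t).count(0))
```

A single global threshold of this kind suits images whose histogram has two clear modes, which is consistent with the abstract's remark that existing methods do well on text-only input.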
- 2. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Pearson Education India, New Delhi (2009)
- 3. Namboodiri, A.M., et al.: Document structure and layout analysis. In: Chaudhuri, B.B. (ed.) Digital Document Processing. Springer, London (2007)
- 4. Dinan, R.F., Dubil, J.F., Malin, J.R., Rodite, R.R., Rohe, C.F., Rohrer, G.D.: Document image processing system. US Patent 4,888,812, 19 December 1989
- 5. Jaimes, A., Mintzer, F.C., Rao, A.R., Thompson, G.: Segmentation and automatic descreening of scanned documents. In: Electronic Imaging 1999, International Society for Optics and Photonics, pp. 517–528 (1998)
- 7. Parker, J.R., Jennings, C., Salkauskas, A.G.: Thresholding using an illumination model. In: Proceedings of the Second International Conference on Document Analysis and Recognition, 1993, pp. 270–273. IEEE (1993)
- 9. Yanowitz, S.D., Bruckstein, A.M.: A new method for image segmentation. In: 9th International Conference on Pattern Recognition, 1988, pp. 270–275. IEEE (1988)
- 13. Otsu, N.: A threshold selection method from gray-level histograms. Automatica 11(285–296), 23–27 (1975)
- 14. California, S.: USC-SIPI image database, University of Southern California. http://sipi.usc.edu/database/
- 19. Chaki, N., Shaikh, S.H., Saeed, K.: Exploring Image Binarization Techniques. SCI, vol. 560. Springer, New Delhi (2014)
- 20. Su, B., Lu, S., Tan, C.L.: Binarization of historical document images using the local maximum and minimum. In: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, pp. 159–166. ACM (2010)
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.