Abstract
Document binarization is an important technique in document image analysis and recognition. Binarization methods are generally ineffective for degraded images. Several methods have been proposed; however, none of them is effective for historical and degraded document images. In this paper, a new binarization method is proposed for degraded document images. The proposed method is based on the variance between pixel contrast and consists of four stages: pre-processing, geometrical feature extraction, feature selection, and post-processing. The proposed method was evaluated in several visual and statistical experiments. The experiments were conducted using five International Document Image Binarization Contest (DIBCO) benchmark datasets designed for binarization testing. The results were compared with five adaptive binarization methods: Niblack, Sauvola thresholding, Sauvola compound algorithm, NICK, and Bataineh. The results show that the proposed method performs better than the other methods in all binarization cases.
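For readers unfamiliar with the adaptive baselines named in the abstract, the following is a minimal sketch of two of them: Niblack thresholding (T = m + k·s) and Sauvola thresholding (T = m·(1 + k·(s/R − 1))), where m and s are the mean and standard deviation of a local window. It is not the proposed method. The function names, the window size, and the values k = −0.2, k = 0.5, and R = 128 are assumptions chosen for illustration (commonly cited defaults), not parameters taken from this paper.

```python
import math

def local_stats(img, r, c, half):
    """Mean and standard deviation of the window centred on (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, math.sqrt(var)

def niblack_threshold(mean, std, k=-0.2):
    # Niblack: T = m + k * s (k = -0.2 is an assumed, commonly used value)
    return mean + k * std

def sauvola_threshold(mean, std, k=0.5, R=128.0):
    # Sauvola: T = m * (1 + k * (s / R - 1)); R is the assumed dynamic range of s
    return mean * (1.0 + k * (std / R - 1.0))

def binarize(img, method="sauvola", window=3):
    """Return a 0/255 image: 0 marks dark (text) pixels, 255 marks background."""
    half = window // 2
    rule = sauvola_threshold if method == "sauvola" else niblack_threshold
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            mean, std = local_stats(img, r, c, half)
            # A pixel strictly darker than its local threshold is labelled text.
            row.append(0 if img[r][c] < rule(mean, std) else 255)
        out.append(row)
    return out

# Hypothetical 5x5 grayscale patch: bright background with one dark stroke pixel.
patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 20
sauvola_result = binarize(patch, method="sauvola")
niblack_result = binarize(patch, method="niblack")
```

Both rules compute one threshold per pixel from local statistics, which is what distinguishes adaptive methods from a single global threshold. The pure-Python loops are only suitable for small examples; a practical implementation would use integral images or a vectorized library.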
References
Kefali A, Sari T, Sellami M (2010) Evaluation of several binarization techniques for old Arabic documents images. In: 1st International symposium on modeling and implementing complex systems “MISC’2010”, pp 88–99
Khurshid K, Siddiqi I, Faure C, Vincent N (2010) Comparison of Niblack Inspired Binarization Methods for Ancient Documents. In: 16th International conference on Document Recognition and Retrieval, pp 1–10
Stathis P, Kavallieratou E, Papamarkos N (2008) An evaluation technique for binarization algorithms. J Univers Comput Sci 14:3011–3030
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9:62–66
Niblack W (1985) An introduction to digital image processing. Prentice Hall, pp 115–116
Sauvola J, Seppanen T, Haapakoski S, Pietikainen M (1997) Adaptive document binarization. In: Fourth international conference document analysis and recognition (ICDAR), pp 147–152
Bataineh B, Abdullah SNHS, Omar K (2011) An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows. Pattern Recognit Lett 32:1805–1813
Chou C, Lin W, Chang F (2010) A binarization method with learning-built rules for document images produced by cameras. Pattern Recognit 43:1518–1530
Gatos B, Pratikakis I, Perantonis S (2006) Adaptive degraded document image binarization. Pattern Recognit 39:317–327
Sauvola J, Pietikainen M (2000) Adaptive document image binarization. Pattern Recognit 33:225–236
Howe R (2011) A Laplacian energy for document binarization. In: International conference on document analysis and recognition “ICDAR 2011”, pp 6–10
Gatos B, Ntirogiannis K, Pratikakis I (2009) ICDAR 2009 document image binarization contest. In: proceedings 10th international conference on document analysis and recognition, pp 1375–1382
Gatos B, Ntirogiannis K, Pratikakis I (2011) DIBCO 2009: document image binarization contest. Int J Doc Anal Recognit 14:35–44
Pratikakis I, Gatos B, Ntirogiannis K (2010) H-DIBCO 2010—handwritten document image binarization competition. In: 12th international conference on frontiers in handwriting recognition, pp 727–732
Pratikakis I, Gatos B, Ntirogiannis K (2011) ICDAR 2011 document image binarization contest (DIBCO 2011). In: International conference on document analysis and recognition “ICDAR2011”, pp 1506–1510
Pratikakis I, Gatos B, Ntirogiannis K (2012) ICFHR 2012 competition on handwritten document image binarization (H-DIBCO 2012). In: 2012 international conference on frontiers in handwriting recognition, pp 813–818
Bassiou N, Kotropoulos C (2007) Color image histogram equalization by absolute discounting back-off. Comput Vis Image Underst 107:108–122
Gonzalez RC, Woods RE (2007) Digital image processing, 3rd ed. Prentice Hall, Upper Saddle River, NJ
Bar-Yosef I, Beckman I, Kedem K, Dinstein I (2007) Binarization, character extraction, and writer identification of historical Hebrew calligraphy documents. Int J Doc Anal Recognit 9:89–99
Chichilnisky EJ, Kalmar RS (2002) Functional asymmetries in ON and OFF ganglion cells of primate retina. J Neurosci 22:2737–2747
Fiorentini A (2004) Brightness and lightness. In: The visual neurosciences, vol 2. MIT Press, Cambridge, pp 881–891
Vonikakis V, Andreadis I, Papamarkos N (2011) Robust document binarization with OFF center-surround cells. Pattern Anal Appl 14:219–234
Konstantinidis K, Vonikakis V, Panitsidis G, Andreadis I (2011) A center-surround histogram for content based image retrieval. Pattern Anal Appl 14:251–260
Shapiro LG, Stockman G (2001) Computer vision, 1st edn. Prentice Hall PTR, Upper Saddle River, NJ
Chiu Y, Chung K, Yang W, Huang Y, Liao C (2012) Parameter-free based two-stage method for binarizing degraded document images. Pattern Recognit 45:4250–4262
Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Inf Process Manag 45:427–437
Narendra PM (1981) A separable median filter for image noise smoothing. IEEE Trans Pattern Anal Mach Intell, pp 20–29
Wang J, Lin L (1997) Improved median filter using minmax algorithm for image processing. Electron Lett 33:1362–1363
Cite this article
Bataineh, B., Abdullah, S.N.H.S. & Omar, K. Adaptive binarization method for degraded document images based on surface contrast variation. Pattern Anal Applic 20, 639–652 (2017). https://doi.org/10.1007/s10044-015-0520-0