Document Image Binarization

Document Layout Analysis

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

Documents are often affected by noise introduced through careless handling during storage or by the image acquisition process. Such noise lowers the quality of the document and degrades the visual appearance of its content, so it can substantially degrade the results of downstream analysis. To suppress the noise present in an input document image at the initial stage, an efficient pre-processing step is essential before further analysis. Pre-processing converts the input image into a form suitable for further analysis without disturbing its information content. Among pre-processing steps, binarization is widely used and is commonly performed before many document image processing tasks. This chapter therefore presents a detailed discussion of document image binarization and recent progress in this domain. To make the discussion complete, it also describes the various degradations that affect document images and the associated noise models.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Bhowmik, S. (2023). Document Image Binarization. In: Document Layout Analysis. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-99-4277-0_2

  • DOI: https://doi.org/10.1007/978-981-99-4277-0_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4276-3

  • Online ISBN: 978-981-99-4277-0

  • eBook Packages: Computer Science, Computer Science (R0)
