Local thresholding of degraded or unevenly illuminated documents using fuzzy inclusion and entropy measures

Abstract

Many applications require the content of a scanned document to be recognized or enhanced. This is commonly achieved by converting the input into a binary image, which is the first step in many document analysis systems and optical character recognition (OCR) processes. When the input is degraded or non-uniformly illuminated, global thresholding algorithms fail to deliver adequate results. Local thresholding techniques address this by binarizing each pixel based on the grayscale information of its neighboring pixels. In this paper, we present a local thresholding method based on specific fuzzy inclusion and entropy measures that we introduced in previous work. We use these measures to quantify specific attributes of a pixel's neighborhood and then calculate an appropriate threshold from those values. The method does not use the image histogram, statistical measures, or input-dependent contrast parameters. The procedure is open, automated and adaptable; we present several implementations of a more general algorithm along with specific results. Our experiments focus mainly on texts with lighting irregularities, but we also remark on further generalization. Finally, we comment on other potential uses of these measures and on the prospect of connecting them with other studies that already use fuzzy inclusion and entropy measures.
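
To make the contrast between global and local thresholding concrete, the following is a minimal sketch under stated assumptions. It uses a plain local-mean rule as a stand-in and is not the fuzzy inclusion and entropy method of the paper, whose measures are not given in the abstract. The function names, the `radius` and `offset` parameters, and the synthetic `page` are all hypothetical and chosen only for illustration.

```python
def local_mean_threshold(image, radius=1, offset=0.0):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    Illustrative local-mean rule (an assumption, not the paper's method):
    a pixel becomes 1 (ink) when it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighborhood minus `offset`, else 0.
    Windows are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [image[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            row.append(1 if image[y][x] < local_mean - offset else 0)
        result.append(row)
    return result


def global_threshold(image, t):
    """Binarize with a single threshold t for the whole image."""
    return [[1 if value < t else 0 for value in row] for row in image]


# Synthetic page (hypothetical data): the background fades from 220 on the
# left to 110 on the right, with two vertical ink strokes at columns 2 and 9
# that are 60 gray levels darker than the local background.
WIDTH, HEIGHT, INK_COLUMNS = 12, 5, (2, 9)
page = [
    [(220 - 10 * x) - (60 if x in INK_COLUMNS else 0) for x in range(WIDTH)]
    for _ in range(HEIGHT)
]

local_result = local_mean_threshold(page, radius=1, offset=10)
global_high = global_threshold(page, 141)  # catches the left stroke
global_low = global_threshold(page, 100)   # avoids the dark background
```

In this toy example no single global threshold separates both strokes from the background: a value high enough to catch the left stroke also marks the darker right-hand background as ink, while a lower value misses the left stroke. The local rule recovers both strokes because each pixel is compared only with its own surroundings. The `offset` margin keeps flat background regions from being marked as ink.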

Author information

Corresponding author

Correspondence to Athanasios C. Bogiatzis.

Cite this article

Bogiatzis, A.C., Papadopoulos, B.K. Local thresholding of degraded or unevenly illuminated documents using fuzzy inclusion and entropy measures. Evolving Systems 10, 593–619 (2019). https://doi.org/10.1007/s12530-018-09262-5

Keywords

  • Fuzzy entropy
  • Fuzzy inclusion
  • Fuzzy measuring
  • Image binarization
  • Local thresholding