Efficient multiscale Sauvola’s binarization

  • Guillaume Lazzara
  • Thierry Géraud
Original Paper

Abstract

This work focuses on the most commonly used binarization method: Sauvola’s. It performs relatively well on classical documents; however, three main defects remain: the window parameter of Sauvola’s formula does not adapt automatically to the contents, the method is not robust to low contrasts, and it is not invariant with respect to contrast inversion. Thus, on documents such as magazines, the contents may not be retrieved correctly, even though correct retrieval is crucial for indexing purposes. In this paper, we describe an efficient multiscale implementation of Sauvola’s algorithm that guarantees good binarization for both small and large objects inside a single document without manually adjusting the window size to the contents. We also describe, step by step, how to implement it efficiently. This algorithm remains notably fast compared to the original one. For fixed parameters, text recognition rates and binarization quality are equal to or better than those of other methods on text with low and medium x-height, and are significantly improved on text with large x-height. Pixel-based accuracy and OCR evaluations are performed on more than 120 documents. Compared to awarded methods in the latest binarization contests, Sauvola’s formula does not give the best results on historical documents. On the other hand, on clean magazines, it outperforms those methods. This implementation improves the robustness of Sauvola’s algorithm by making the results almost insensitive to the window size, whatever the object sizes. Its properties make it usable in full document analysis toolchains.
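
For readers unfamiliar with the formula the abstract refers to, the following is a minimal sketch of the classical single-scale Sauvola threshold, T = m · (1 + k · (s / R − 1)), where m and s are the mean and standard deviation of the gray levels in a square window centred on each pixel. It is not the multiscale implementation described in the paper. The function names, the edge-replication padding, and the default values (window = 15, k = 0.5, R = 128) are assumptions chosen for illustration, and the integral-image trick is one common way to compute the local statistics. The sketch mainly shows why the window size matters: it is the only place where the scale of the objects enters the formula.

```python
import numpy as np


def sauvola_threshold(gray, window=15, k=0.5, R=128.0):
    """Per-pixel Sauvola threshold T = m * (1 + k * (s / R - 1)).

    Illustrative single-scale sketch; window, k and R are assumed defaults.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    half = window // 2

    # Replicate the borders so that every pixel has a full window.
    padded = np.pad(img, half, mode="edge")

    # Integral images of the values and of the squared values,
    # each with a leading row and column of zeros.
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    ii2 = np.pad(padded ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

    def window_sum(table):
        # Sum over each window from four corner lookups.
        return (table[window:window + h, window:window + w]
                - table[:h, window:window + w]
                - table[window:window + h, :w]
                + table[:h, :w])

    n = window * window
    mean = window_sum(ii) / n
    # Clamp tiny negative values caused by floating-point rounding.
    var = np.maximum(window_sum(ii2) / n - mean ** 2, 0.0)
    std = np.sqrt(var)
    return mean * (1.0 + k * (std / R - 1.0))


def sauvola_binarize(gray, window=15, k=0.5, R=128.0):
    """Return a boolean mask that is True where a pixel is darker than T."""
    return np.asarray(gray) < sauvola_threshold(gray, window, k, R)
```

With a fixed window, the local statistics are only meaningful for objects whose strokes are small relative to the window. Inside a large, uniformly dark object the standard deviation drops and the threshold falls below the gray level of the object itself, which is the kind of failure on large objects that motivates a multiscale approach.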

Keywords

Binarization, Multiscale, Document image analysis, Algorithm

Notes

Acknowledgments

The authors would like to thank Yongchao Xu, Jonathan Fabrizio, and Roland Levillain for proof-reading and commenting on the paper, and Reza Farrahi Moghaddam, Thibault Lelore, and Frédéric Bouchara for their active collaboration. The authors are very grateful to Yan Gilbert who has accepted that we use and publish as data some pages from the French magazine “Le Nouvel Obs” (issue 2402, November 18–24, 2010) for our experiments.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. EPITA Research and Development Laboratory, LRDE, Le Kremlin-Bicêtre, France
