Fast Adaptive Binarization with Background Estimation for Non-uniformly Lightened Document Images
Fast and reliable adaptive binarization of unevenly illuminated document images is a key issue for Optical Character Recognition (OCR) on mobile devices with limited computational power. When a document image is captured in unknown lighting conditions, a single global threshold in the binarization step may make text recognition impossible, because parts of the text might be lost in the resulting binary image.
On the other hand, some well-known adaptive binarization methods, e.g. Niblack, Sauvola and their modifications, are computationally demanding and might not be applied efficiently in some applications. Therefore, a method filling the gap between these two approaches is proposed in the paper. It is a region-based approach utilizing a lighting correction method whose input data are taken from the lighting distribution approximated using reduced-resolution images. The obtained binarization results are superior to those of typically used adaptive thresholding algorithms in terms of both computational speed and final OCR accuracy.
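The general idea of correcting the lighting with a background estimated at reduced resolution can be illustrated with a minimal sketch. It is not the method proposed in the paper: the block-wise maximum used as the background estimate, the nearest-neighbour upsampling, the function names `estimate_background` and `binarize`, and the parameters `block` and `threshold` are all assumptions chosen for illustration only.

```python
import numpy as np


def estimate_background(gray, block=16):
    """Coarse lighting map: per-block maximum, upsampled back to full size.

    Assumption for illustration: every block contains at least some
    background (paper) pixels, so the block maximum approximates the
    local lighting level.
    """
    h, w = gray.shape
    # Pad with edge values so both dimensions are multiples of the block size.
    pad_h, pad_w = (-h) % block, (-w) % block
    padded = np.pad(gray, ((0, pad_h), (0, pad_w)), mode="edge")
    H, W = padded.shape
    # Reduced-resolution image: one value per block.
    low_res = padded.reshape(H // block, block, W // block, block).max(axis=(1, 3))
    # Nearest-neighbour upsampling back to the padded size, then crop.
    full = np.repeat(np.repeat(low_res, block, axis=0), block, axis=1)
    return full[:h, :w]


def binarize(gray, block=16, threshold=0.5):
    """Divide by the estimated background, then apply one global threshold.

    Returns a boolean array: True for background, False for ink.
    """
    gray = gray.astype(np.float64)
    background = estimate_background(gray, block)
    corrected = gray / np.maximum(background, 1.0)  # avoid division by zero
    return corrected > threshold
```

Because the background map is computed once per block rather than once per pixel neighbourhood, the cost of this sketch stays close to that of global thresholding, which is the trade-off the abstract points at. A real implementation would likely need a smoother interpolation of the coarse map to avoid block artefacts.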
Keywords: Binarization, OCR, Document image analysis