Unsupervised Document Binarization Via Disentangled Representation

Salman, K. H.; Bhagvati, Chakravarthy

doi:10.1007/978-981-16-6616-2_39

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 267)

Abstract

Binarization of documents is considered the first key step in many document processing tasks. In this paper, we reformulate the problem as image-to-image translation. Most existing learning-based methods for document binarization are supervised, but ground truth for binarized documents is difficult to obtain. We therefore develop an unsupervised adversarial training procedure for binarization. We disentangle style and content from a binarized document and transfer the binarized style to the input document. Our results indicate that this approach performs on par with many results published in the literature.
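For readers unfamiliar with the task, binarization maps every pixel of a grayscale page to either ink or background. The sketch below shows a classical global-threshold baseline (Otsu's method), not the unsupervised adversarial approach described in this abstract. The function names, the 0 = ink / 1 = background convention, and the toy image are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class (dark), the rest the other (bright).
    `pixels` is a flat iterable of integers in [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # dark class still empty
        if w1 == 0:
            break     # bright class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(image):
    """Binarize a grayscale image given as a list of rows of ints.

    Assumed convention: 0 marks ink (dark), 1 marks background (bright).
    """
    t = otsu_threshold(p for row in image for p in row)
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy 2x3 "page": three dark ink pixels and three bright background pixels.
toy_page = [[20, 30, 200],
            [25, 210, 220]]
binary_page = binarize(toy_page)
```

A single global threshold of this kind tends to struggle on pages with uneven illumination or stains, which is one reason learned approaches such as the one summarized here are of interest.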

Author information

Authors and Affiliations

University of Hyderabad, Hyderabad, India
K. H. Salman & Chakravarthy Bhagvati

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Vikrant Bhateja
College of Computing, Michigan Technological University, Michigan, MI, USA
Jinshan Tang
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, India
Suresh Chandra Satapathy
Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
Peter Peer
Department of Computer Science and Engineering, National Institute of Technology (NIT) Mizoram, Aizawl, India
Ranjita Das

Cite this paper

Salman, K.H., Bhagvati, C. (2022). Unsupervised Document Binarization Via Disentangled Representation. In: Bhateja, V., Tang, J., Satapathy, S.C., Peer, P., Das, R. (eds) Evolution in Computational Intelligence. Smart Innovation, Systems and Technologies, vol 267. Springer, Singapore. https://doi.org/10.1007/978-981-16-6616-2_39

DOI: https://doi.org/10.1007/978-981-16-6616-2_39
Published: 24 April 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6615-5
Online ISBN: 978-981-16-6616-2
eBook Packages: Intelligent Technologies and Robotics (R0)
