Simultaneous Detection of Regular Patterns in Ancient Manuscripts Using GAN-Based Deep Unsupervised Segmentation

Part of the Lecture Notes in Computer Science book series (LNIP, volume 12667)

Abstract

Document information retrieval has attracted researchers’ attention as a means of uncovering the secrets of ancient manuscripts. To understand such documents, analyzing their layouts and segmenting their relevant features are fundamental tasks. Recent work has turned to unsupervised document segmentation, and its importance for ancient manuscripts offers a unique opportunity to study the problem. This paper proposes a novel collaborative deep learning architecture, trained in an unsupervised mode, that generates synthetic data to avoid uncertainties caused by document degradation. The approach then uses the generated distribution to assign labels associated with superpixels. The trained model segments the page, ornaments, and characters simultaneously, with promising accuracy on the segmentation task. Experiments on degraded documents show that the proposed method can synthesize noise-free documents and enhance associations better than state-of-the-art methods. We also investigate the use of the generated samples and their effectiveness in different tasks on unlabelled historical documents.
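
The abstract does not spell out how labels are associated with superpixels. As an illustration only, one common convention in unsupervised segmentation is a majority vote: every pixel inside a superpixel takes the label that occurs most often there. The minimal sketch below assumes that convention; the function name and the two input grids (per-pixel cluster labels and precomputed superpixel ids) are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def refine_labels_by_superpixel(pixel_labels, superpixel_ids):
    """Give every pixel the most frequent label inside its superpixel.

    Both arguments are 2D lists of equal shape. They are hypothetical
    inputs for illustration: per-pixel cluster labels and superpixel ids
    computed beforehand by any superpixel algorithm.
    """
    # Count how often each label occurs within each superpixel.
    votes = defaultdict(Counter)
    for label_row, sp_row in zip(pixel_labels, superpixel_ids):
        for label, sp in zip(label_row, sp_row):
            votes[sp][label] += 1

    # Pick the most common label per superpixel.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}

    # Paint the winning label back onto every pixel of the superpixel.
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_ids]


if __name__ == "__main__":
    labels = [[0, 0, 1, 1],
              [0, 2, 1, 1],
              [2, 2, 2, 1]]
    superpixels = [[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 2]]
    for row in refine_labels_by_superpixel(labels, superpixels):
        print(row)
```

In this toy grid the stray label 2 inside superpixel 0 and the stray label 1 inside superpixel 2 are outvoted, so each superpixel ends up with a single label.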

Keywords

  • Ancient manuscripts
  • Degradations
  • Synthesize data
  • Unsupervised segmentation
  • Layout

  • DOI: 10.1007/978-3-030-68787-8_20
  • Chapter length: 13 pages

Acknowledgement

The authors thank the NSERC Discovery grant held by Prof. Cheriet for financial support. We also thank Ms. MG Jones for assistance and comments that greatly improved the manuscript.

Corresponding author

Correspondence to Milad Omrani Tamrin.

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this paper

Tamrin, M.O., Cheriet, M. (2021). Simultaneous Detection of Regular Patterns in Ancient Manuscripts Using GAN-Based Deep Unsupervised Segmentation. In: , et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_20

  • DOI: https://doi.org/10.1007/978-3-030-68787-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68786-1

  • Online ISBN: 978-3-030-68787-8

  • eBook Packages: Computer Science, Computer Science (R0)