Quaternion Generative Adversarial Networks for Inscription Detection in Byzantine Monuments

Sfikas, Giorgos; Giotis, Angelos P.; Retsinas, George; Nikou, Christophoros

doi:10.1007/978-3-030-68787-8_12

Conference paper
First Online: 21 February 2021

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12667)

Abstract

In this work, we introduce and discuss Quaternion Generative Adversarial Networks, a variant of generative adversarial networks that uses quaternion-valued inputs, weights and intermediate network representations. Quaternionic representation has the advantage of treating cross-channel information carried by multichannel signals (e.g. color images) holistically, while quaternionic convolution has been shown to be less resource-demanding. Standard convolutional and deconvolutional layers are replaced by their quaternionic variants in both the generator and discriminator nets, while activations and loss functions are adapted accordingly. We have successfully tested the model on the task of detecting Byzantine inscriptions in the wild, where the proposed model is on par with a vanilla conditional generative adversarial network but is significantly less expensive in terms of model size (it requires \(4\times\) fewer parameters). Code is available at https://github.com/sfikas/quaternion-gan.
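
The \(4\times\) parameter saving mentioned in the abstract can be illustrated with a small counting exercise. The sketch below is not taken from the authors' repository; it assumes the usual quaternion-layer construction, in which every group of four real channels forms one quaternion and each quaternion weight acts through the Hamilton product. The function names, the example channel counts and the decision to ignore bias terms are all assumptions made for illustration.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as tuples (a, b, c, d),
    standing for a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def real_conv_params(c_in, c_out, k):
    """Weights in a real-valued k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def quaternion_conv_params(c_in, c_out, k):
    """Weights in a quaternion k x k convolution (bias ignored).

    c_in and c_out are counted in real channels, so each side holds
    c/4 quaternion channels, and each quaternion weight stores 4 reals.
    """
    if c_in % 4 or c_out % 4:
        raise ValueError("channel counts must be multiples of 4")
    return 4 * (c_in // 4) * (c_out // 4) * k * k


if __name__ == "__main__":
    i, j = (0, 1, 0, 0), (0, 0, 1, 0)
    print("i * j =", hamilton_product(i, j))
    print("j * i =", hamilton_product(j, i))

    # Hypothetical layer sizes, chosen only for illustration.
    real = real_conv_params(64, 128, 3)
    quat = quaternion_conv_params(64, 128, 3)
    print("real:", real, "quaternion:", quat, "ratio:", real / quat)
```

Under these assumptions the saving comes from weight sharing: one quaternion weight (four reals) is reused across all sixteen real-to-real connections between a pair of quaternion channels, which is where the factor of four arises.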

Acknowledgement

We would like to thank Dr. Christos Stavrakos, Dr. Katerina Kontopanagou, Dr. Fanny Lyttari and Ioannis Theodorakopoulos for supplying us with the Byzantine inscription images used for our experiments.

We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan XP GPU used for this research.

This research has been partially co-financed by the EU and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call OPEN INNOVATION IN CULTURE, project Bessarion (T6YBΠ-00214).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece
Giorgos Sfikas, Angelos P. Giotis & Christophoros Nikou
School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
George Retsinas

Corresponding author

Correspondence to Giorgos Sfikas.

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Alberto Del Bimbo
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Rita Cucchiara
Department of Computer Science, Boston University, Boston, MA, USA
Stan Sclaroff
Dipartimento di Matematica e Informatica, University of Catania, Catania, Italy
Giovanni Maria Farinella
Cloud & AI, JD.COM, Beijing, China
Tao Mei
Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Marco Bertini
Computational Sciences Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Tonantzintla, Puebla, Mexico
Hugo Jair Escalante
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Roberto Vezzani

About this paper

Cite this paper

Sfikas, G., Giotis, A.P., Retsinas, G., Nikou, C. (2021). Quaternion Generative Adversarial Networks for Inscription Detection in Byzantine Monuments. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_12

DOI: https://doi.org/10.1007/978-3-030-68787-8_12
Published: 21 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68786-1
Online ISBN: 978-3-030-68787-8
eBook Packages: Computer Science, Computer Science (R0)
