Arbitrary Style Transfer with Style Enhancement and Structure Retention

Yang, Sijia; Zhou, Yun

doi:10.1007/978-3-031-50072-5_32

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14496))

Included in the following conference series:

Computer Graphics International Conference

461 Accesses

Abstract

Arbitrary style transfer is to transfer the style of any reference image to another image by a trained neural network, while preserving its content as much as possible. So far, a lot of work focused on reducing training costs without achieving sufficient style transfer, while some other work has neglected image frequency domain alignment. Balancing style presentation and content retention is no doubt challenging, we therefore propose a style transfer method that introduces frequency domain alignment and style secondary embedding, which is mainly embodied in two parts: style enhancement module (SEM) and content retention module (SRM). SEM aligns the stylistic image and stylized image statistics in the feature space. SRM reduces the loss of content by mapping the original and stylized images into the frequency domain and airspace for synchronous alignment. This new approach works well in terms of both style transfer and content retention. Experimental and questionnaire results show that this method can generate satisfactory stylized images without loss of content information.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 59.99; Price excludes VAT (USA)

Softcover Book: USD 79.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Following the setting of [3], we tried several mask sizes and chose 27, which generated good results.

References

Afifi, M., Brubaker, M.A., Brown, M.S.: Histogan: controlling colors of GAN-generated and real images via color histograms. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7941–7950 (2021)
Google Scholar
An, J., Huang, S., Song, Y., Dou, D., Liu, W., Luo, J.: Artflow: unbiased image style transfer via reversible neural flows. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 862–871 (2021)
Google Scholar
Cai, M., Zhang, H., Huang, H., Geng, Q., Li, Y., Huang, G.: Frequency domain image translation: more photo-realistic, better identity-preserving. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 13930–13940 (2021)
Google Scholar
Choi, Y., Choi, M., Kim, M., Ha, J.W., Kim, S., Choo, J.: Stargan: unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8789–8797 (2018)
Google Scholar
Deng, Y., Tang, F., Dong, W., Huang, H., Ma, C., Xu, C.: Arbitrary video style transfer via multi-channel correlation. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 1210–1217 (2021)
Google Scholar
Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. arXiv preprint arXiv:1610.07629 (2016)
Fišer, J., et al.: Stylit: illumination-guided example-based stylization of 3D renderings. ACM Trans. Graph. (TOG) 35(4), 1–11 (2016)
Article Google Scholar
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
Google Scholar
Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3985–3993 (2017)
Google Scholar
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The kitti vision benchmark suite. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354–3361. IEEE (2012)
Google Scholar
Godard, C., Mac Aodha, O., Firman, M., Brostow, G.J.: Digging into self-supervised monocular depth estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3828–3838 (2019)
Google Scholar
Hertzmann, A.: Painterly rendering with curved brush strokes of multiple sizes. In: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pp. 453–460 (1998)
Google Scholar
Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)
Google Scholar
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Chapter Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lee, J., Son, H., Lee, G., Lee, J., Cho, S., Lee, S.: Deep color transfer using histogram analogy. Vis. Comput. 36, 2129–2143 (2020)
Article Google Scholar
Li, S., Xu, X., Nie, L., Chua, T.S.: Laplacian-steered neural style transfer. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 1716–1724 (2017)
Google Scholar
Li, X., Liu, S., Kautz, J., Yang, M.H.: Learning linear transformations for fast image and video style transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3809–3817 (2019)
Google Scholar
Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., Yang, M.H.: Diversified texture synthesis with feed-forward networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3920–3928 (2017)
Google Scholar
Li, Z., Zhao, X., Zhao, C., Tang, M., Wang, J.: Transfering low-frequency features for domain adaptation. In: 2022 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE (2022)
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, S., et al.: Adaattn: revisit attention mechanism in arbitrary neural style transfer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6649–6658 (2021)
Google Scholar
Park, D.Y., Lee, K.H.: Arbitrary style transfer with style-attentional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5880–5888 (2019)
Google Scholar
Phillips, F., Mackintosh, B.: Wiki art gallery, inc.: a case for critical thinking. Issues Account. Educ. 26(3), 593–608 (2011)
Google Scholar
Sanakoyeu, A., Kotovenko, D., Lang, S., Ommer, B.: A style-aware content loss for real-time HD style transfer. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 698–714 (2018)
Google Scholar
Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.: Texture networks: feed-forward synthesis of textures and stylized images. arXiv preprint arXiv:1603.03417 (2016)
Wang, L., Wang, Z., Yang, X., Hu, S.M., Zhang, J.: Photographic style transfer. Vis. Comput. 36, 317–331 (2020)
Article Google Scholar
Xu, W., Long, C., Wang, R., Wang, G.: DRB-GAN: a dynamic resblock generative adversarial network for artistic style transfer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6383–6392 (2021)
Google Scholar
Yang, H., Min, K.: Importance-based approach for rough drawings. Vis. Comput. 35(4), 609–622 (2019)
Article Google Scholar
Zhang, H., Dana, K.: Multi-style generative network for real-time transfer. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Google Scholar
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018)
Google Scholar
Zhang, Y., et al.: Domain enhanced arbitrary image style transfer via contrastive learning. arXiv preprint arXiv:2205.09542 (2022)

Download references

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62276262; Science and Technology Innovation Program of Hunan Province under Grant 2021RC3076; Training Program for Excellent Young Innovators of Changsha under Grant KQ2009009.

Author information

Authors and Affiliations

National University of Defense Technology, Changsha, 410000, China
Sijia Yang & Yun Zhou

Authors

Sijia Yang
View author publications
You can also search for this author in PubMed Google Scholar
Yun Zhou
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yun Zhou .

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Shanghai Jiao Tong University, Shanghai, China
Lei Bi
University of Sydney, Sydney, NSW, Australia
Jinman Kim
MIRALab-CUI, University of Geneve, Carouge, Geneve, Switzerland
Nadia Magnenat-Thalmann
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Yang, S., Zhou, Y. (2024). Arbitrary Style Transfer with Style Enhancement and Structure Retention. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14496. Springer, Cham. https://doi.org/10.1007/978-3-031-50072-5_32

Download citation

DOI: https://doi.org/10.1007/978-3-031-50072-5_32
Published: 29 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50071-8
Online ISBN: 978-3-031-50072-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Arbitrary Style Transfer with Style Enhancement and Structure Retention