
Two-stream FCNs to balance content and style for style transfer

Original Paper · Machine Vision and Applications

Abstract

Style transfer renders given image contents in given styles, and it plays an important role in both fundamental computer vision research and industrial applications. Following the success of deep learning-based approaches, this problem has recently attracted renewed attention, yet it remains difficult because of the trade-off between preserving content and rendering style faithfully. Indeed, how well content and style are balanced is crucial in evaluating the quality of stylized images. In this paper, we propose an end-to-end two-stream fully convolutional network (FCN) that aims at balancing the contributions of content and style in rendered images. Our proposed network consists of an encoder part and a decoder part. The encoder part uses one FCN for content and one FCN for style, where the two FCNs exchange feature injections and are trained independently: one to preserve the semantic content and the other to learn a faithful style representation. The semantic content feature and the style representation feature are then adaptively concatenated and fed into the decoder to generate style-transferred (stylized) images. To train our proposed network, we employ a loss network, the pre-trained VGG-16, to compute a content loss and a style loss, both of which are used efficiently for the feature injection as well as the feature concatenation. Our extensive experiments show that our proposed model generates stylized images whose content and style are better balanced than those produced by state-of-the-art methods. Moreover, our proposed network is also efficient in speed.
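To make the pipeline described above concrete, here is a minimal PyTorch sketch of the two-stream idea: two encoder FCNs, a feature injection between streams, an adaptive concatenation, a decoder, and a frozen VGG-16 loss network. This is an illustration under stated assumptions, not the paper's implementation: the layer depths, the single additive injection, the learned gate `alpha`, and the `conv_block` helper are all hypothetical, and the abstract does not specify the exact weighting or injection scheme.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16


def conv_block(in_ch, out_ch, stride=2):
    # Conv + instance norm + ReLU: a standard FCN building block (assumed).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TwoStreamTransfer(nn.Module):
    def __init__(self):
        super().__init__()
        # Two independent encoder streams: one FCN for content,
        # one FCN for style (depths here are illustrative).
        self.content_enc = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.style_enc = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        # A learned gate standing in for the "adaptive" concatenation;
        # the paper's exact weighting scheme is not given in the abstract.
        self.alpha = nn.Parameter(torch.tensor(0.5))
        # Decoder maps the fused features back to an RGB image.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            conv_block(128, 64, stride=1),
            nn.Upsample(scale_factor=2, mode="nearest"),
            conv_block(64, 32, stride=1),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
        )

    def forward(self, content_img, style_img):
        # Assumes content and style images share the same spatial size.
        f_c = self.content_enc(content_img)  # semantic content features
        f_s = self.style_enc(style_img)      # style representation features
        # Feature injection: leak style signal into the content stream
        # (a single additive injection stands in for the paper's scheme).
        f_c = f_c + self.alpha * f_s
        # Adaptive concatenation of the two streams, then decode.
        fused = torch.cat([(1 - self.alpha) * f_c, self.alpha * f_s], dim=1)
        return self.decoder(fused)


def gram_matrix(feat):
    # Gram matrix of a feature map: the usual statistic for a style loss.
    b, c, h, w = feat.shape
    feat = feat.view(b, c, h * w)
    return feat @ feat.transpose(1, 2) / (c * h * w)


# Loss network: frozen pre-trained VGG-16, as named in the abstract.
# (Which layers feed the content and style losses is an assumption.)
loss_net = vgg16(weights="IMAGENET1K_V1").features.eval()
for p in loss_net.parameters():
    p.requires_grad_(False)
```

As a usage sketch, `TwoStreamTransfer()(content, style)` on two 1x3x256x256 tensors returns a stylized image of the same size; during training, `loss_net` would be applied to the output, the content image, and the style image, with a feature-reconstruction content loss and a Gram-based style loss combined into the training objective.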


Notes

  1. https://github.com/jcjohnson/neural-style.

  2. https://github.com/jcjohnson/fast-neural-style.

  3. https://github.com/xunhuang1995/AdaIN-style.

  4. https://github.com/LucasSheng/avatar-net.

  5. https://github.com/rtqichen/style-swap.

  6. https://github.com/Yijunmaverick/UniversalStyleTransfer.

  7. https://pytorch.org/.


Acknowledgements

This work was supported in part by JST CREST (Grant No. JPMJCR14D1). The authors thank Dr. Trung-Nghia Le for his valuable comments on this work.

Author information

Correspondence to Duc Minh Vo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Vo, D.M., Sugimoto, A. Two-stream FCNs to balance content and style for style transfer. Machine Vision and Applications 31, 37 (2020). https://doi.org/10.1007/s00138-020-01086-1

