Computational Decomposition of Style for Controllable and Enhanced Style Transfer

Conference paper in Intelligence Science and Big Data Engineering. Big Data and Machine Learning (IScIDE 2019), Lecture Notes in Computer Science, vol. 11936.

Abstract

Neural style transfer has been demonstrated to be powerful in creating artistic images with the help of Convolutional Neural Networks (CNNs), but continuously controllable transfer remains a challenging task. This paper provides a computational decomposition of style into basic factors, which aim to be factorized, interpretable representations of artistic styles. We propose to decompose style not only by spectrum-based methods, including the Fast Fourier Transform and the Discrete Cosine Transform, but also by latent variable models such as Principal Component Analysis and Independent Component Analysis. Such decomposition induces various ways of controlling the style factors to generate enhanced, diversified styled images. We mix or intervene on the style bases from more than one style, so that compound or new styles can be generated to produce styled images. To implement our method, we derive a simple, effective computational module that can be embedded into state-of-the-art style transfer algorithms. Experiments demonstrate the effectiveness of our method not only on painting style transfer but also on other possible applications such as picture-to-sketch problems.

References

  1. Champandard, A.J.: Semantic style transfer and turning two-bit doodles into fine artworks. arXiv preprint arXiv:1603.01768 (2016)

  2. Chen, Y., Tu, S., Yi, Y., Xu, L.: Sketch-pix2seq: a model to generate sketches of multiple categories. arXiv preprint arXiv:1709.04121 (2017)

  3. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

  4. van der Maaten, L., Hinton, G.E.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)

  5. Efros, A.A., Leung, T.K.: Texture synthesis by non-parametric sampling. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1033–1038. IEEE (1999)

  6. Engstrom, L.: Fast style transfer (2016). https://github.com/lengstrom/fast-style-transfer/

  7. Gatys, L., Ecker, A.S., Bethge, M.: Texture synthesis using convolutional neural networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28, pp. 262–270. Curran Associates, Inc. (2015)

  8. Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)

  9. Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3985–3993 (2017)

  10. Hertzmann, A.P.: Algorithms for rendering in artistic styles. Ph.D. thesis, New York University, Graduate School of Arts and Science (2001)

  11. Hyvarinen, A.: Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw. 10(3), 626–634 (1999)

  12. Jing, Y., Yang, Y., Feng, Z., Ye, J., Yu, Y., Song, M.: Neural style transfer: a review. IEEE Trans. Vis. Comput. Graph. (2019)

  13. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43

  14. Li, C., Wand, M.: Combining Markov random fields and convolutional neural networks for image synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2479–2486 (2016)

  15. Li, C., Wand, M.: Precomputed real-time texture synthesis with Markovian generative adversarial networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 702–716. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_43

  16. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)

  17. Rosin, P., Collomosse, J.: Image and Video-Based Artistic Stylisation, vol. 42. Springer, London (2012). https://doi.org/10.1007/978-1-4471-4519-6

  18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  19. Song, J., Pang, K., Song, Y.Z., Xiang, T., Hospedales, T.M.: Learning to sketch with shortcut cycle consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 801–810 (2018)

  20. Tenenbaum, J.B., De Silva, V., Langford, J.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

  21. Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.: Texture networks: feed-forward synthesis of textures and stylized images. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning, ICML 2016, vol. 48, pp. 1349–1357 (2016)

  22. Wei, L.Y., Levoy, M.: Fast texture synthesis using tree-structured vector quantization. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, pp. 479–488. ACM Press/Addison-Wesley Publishing Co., New York (2000)

  23. Yin, R.: Content aware neural style transfer. arXiv preprint arXiv:1601.04568 (2016)

Acknowledgement

This work was supported by the Zhi-Yuan Chair Professorship Start-up Grant (WF220103010), and Startup Fund (WF220403029) for Youngman Research, from Shanghai Jiao Tong University.

Author information

Correspondence to Shikui Tu or Lei Xu.

Appendices

Appendix 1: The Stylization Effect of Every Activation Layer in VGG19

Since different layers in VGG19 [18] represent different levels of abstraction, we examine the stylization effect of every activation layer on a couple of different style images, as shown in Figs. 9 and 10. Our experiments show that not every layer is effective for style transfer; among those that work, shallow layers transfer only the coarse scale of style (color), while deep layers transfer both the coarse scale (color) and the detailed scale (stroke) of style, which conforms to the scale-control results in [9]. Since 'relu4_1' performs best in style transfer after the same number of learning epochs, we choose to study the feature map of 'relu4_1' in our research.
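
For concreteness, the following is a minimal sketch of how a single activation layer can be probed with the per-layer style loss of [8], assuming PyTorch/torchvision; the layer index for 'relu4_1' follows torchvision's VGG19 layout and is worth verifying against your torchvision version, and the helper names are ours.

```python
# Minimal sketch: per-layer Gram-matrix style loss as in Gatys et al. [8].
import torch
import torchvision.models as models

vgg = models.vgg19(pretrained=True).features.eval()
RELU4_1 = 20  # index of 'relu4_1' under torchvision's conv/ReLU/pool ordering

def features_at(x, layer_idx):
    """Run `x` through VGG19 up to and including `layer_idx`."""
    for i, module in enumerate(vgg):
        x = module(x)
        if i == layer_idx:
            return x
    raise IndexError(layer_idx)

def gram(feat):
    """Normalized Gram matrix of a (1, C, H, W) feature map."""
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

def style_loss(generated, style, layer_idx=RELU4_1):
    """Squared distance between the Gram matrices at one chosen layer."""
    g = gram(features_at(generated, layer_idx))
    s = gram(features_at(style, layer_idx))
    return torch.mean((g - s) ** 2)
```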

We further visualize each channel of the feature map of the style image using t-SNE [4], as shown in Figs. 11 and 12, where the similarity between the two results is quite interesting. However, we cannot yet explain the specific patterns in these visualizations. Moreover, the relationship between the similarity of the visualization results and the similarity of the stylization effects of different VGG layers remains for further study.
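
This visualization step can be reproduced roughly as follows, assuming scikit-learn's t-SNE [4] and the `features_at` helper sketched above; the perplexity value is illustrative, not the paper's setting.

```python
# Minimal sketch: project every channel of one layer's feature map to 2-D.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_channels_tsne(style_image, layer_idx):
    feat = features_at(style_image, layer_idx).detach()  # (1, C, H, W)
    channels = feat.squeeze(0).flatten(1).cpu().numpy()  # (C, H*W): one row per channel
    xy = TSNE(n_components=2, perplexity=30).fit_transform(channels)
    plt.scatter(xy[:, 0], xy[:, 1], s=8)
    plt.show()
```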

Fig. 10. Styled images generated with [8] after the same number of learning epochs, using each single activation layer of the pre-trained VGG19.

Fig. 11. Low-dimensional projection of all channels of every single layer of the style image (Fig. 9(b)) via t-SNE [4].

Fig. 12. Low-dimensional projection of all channels of every single layer of the style image (Fig. 10(b)) via t-SNE [4].

Fig. 13. (a) Chinese paintings; (b) oil paintings (by Leonid Afremov); (c) pen sketches; (d) low-dimensional projections of the spectrum of styles (a–c) via Isomap [20].

Fig. 14. Low-dimensional projections of the spectrum of a large set of style images via Isomap [20]. The size of each image above carries no information; it is chosen only to prevent the images from overlapping.

Appendix 2: The Manifold of Spectrum Based Methods

We analyze the spectrum space by projecting the style bases into a low-dimensional space via Isomap [20], where the X-axis represents the color basis and the Y-axis represents the stroke basis; this analysis demonstrates the effectiveness and robustness of spectrum-based methods. We experiment with three artistic styles (shown in Fig. 13(a–c)). Chinese paintings and pen sketches share a similar color style that is sharply distinct from that of oil paintings, while the strokes of the three styles differ markedly from one another. Accordingly, as shown in Fig. 13(d), Chinese paintings and pen sketches lie close to each other and both far from the oil paintings along the X-axis (color), while the three styles are mutually separable along the Y-axis (stroke), which agrees with our analysis of the three artistic styles.
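
As a rough illustration of this projection, the sketch below summarizes each style image by the DCT magnitude spectrum of its 'relu4_1' feature map and embeds the descriptors with scikit-learn's Isomap [20]. The exact spectral descriptor is our assumption rather than the paper's specification, and `style_images` is a hypothetical list of preprocessed style tensors of equal spatial size.

```python
# Minimal sketch: Isomap embedding of spectral style descriptors.
import numpy as np
from scipy.fft import dctn
from sklearn.manifold import Isomap

def spectrum_descriptor(style_image, layer_idx=RELU4_1):
    """Average per-channel 2-D DCT magnitude spectrum of one feature map."""
    feat = features_at(style_image, layer_idx).detach().squeeze(0).cpu().numpy()
    spec = dctn(feat, axes=(1, 2), norm='ortho')  # 2-D DCT of each channel
    return np.abs(spec).mean(axis=0).ravel()

# `style_images`: hypothetical list of preprocessed (1, 3, H, W) style tensors.
descriptors = np.stack([spectrum_descriptor(img) for img in style_images])
embedding = Isomap(n_components=2).fit_transform(descriptors)  # (N, 2) points
```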

When we apply the same method to a large set of style images (Fig. 14), the X-axis clearly represents a linear transition from dull-colored to rich-colored. However, we fail to identify any notable linear transition along the Y-axis in the 2-dimensional visualization, probably because it is hard to describe the style of stroke (boldness, length, curvature, etc.) using only one dimension.

Fig. 15. Styled images with the stroke basis intervened using spectrum-based methods. The leftmost column shows the style images (from top to bottom: The Great Wave off Kanagawa, Katsushika Hokusai; Composition, Alberto Magnelli; Dancer, Ernst Ludwig Kirchner; Pistachio Tree in the Courtyard of the Chateau Noir, Paul Cezanne; Portrait, Lucian Freud). From left to right in each row, the effect of stroke is increasingly amplified.

Fig. 16. Styled images with the stroke basis intervened using spectrum-based methods. The leftmost column shows the style images (from top to bottom: A Muse (La Muse), Pablo Picasso; Number 4 (Gray and Red), Jackson Pollock; Shipwreck, J.M.W. Turner; Natura Morta, Giorgio Morandi; The Scream, Edvard Munch). From left to right in each row, the effect of stroke is increasingly amplified. (Color figure online)

Fig. 17. Styled images with the stroke basis intervened using ICA. The leftmost column shows the style images (from top to bottom: The Great Wave off Kanagawa, Katsushika Hokusai; Composition, Alberto Magnelli; Dancer, Ernst Ludwig Kirchner; Pistachio Tree in the Courtyard of the Chateau Noir, Paul Cezanne; Portrait, Lucian Freud). From left to right in each row, the effect of stroke is increasingly amplified.

Fig. 18. Styled images with the stroke basis intervened using ICA. The leftmost column shows the style images (from top to bottom: A Muse (La Muse), Pablo Picasso; Number 4 (Gray and Red), Jackson Pollock; Shipwreck, J.M.W. Turner; Natura Morta, Giorgio Morandi; The Scream, Edvard Munch). From left to right in each row, the effect of stroke is increasingly amplified. (Color figure online)

Appendix 3: Stroke Intervention

We demonstrate more styled images with the stroke basis intervened using the spectrum-based method (Figs. 15 and 16) and ICA (Figs. 17 and 18), respectively.
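
One plausible form of such an intervention is sketched below, under our assumption that the stroke basis corresponds to the high-frequency band of the per-channel DCT spectrum of the feature map; `alpha` and `cutoff` are illustrative parameters, not values from the paper.

```python
# Minimal sketch: amplify the high-frequency (assumed stroke) DCT band
# of a (C, H, W) feature map while leaving the low-frequency block intact.
import numpy as np
from scipy.fft import dctn, idctn

def amplify_stroke(feat, alpha=1.5, cutoff=8):
    spec = dctn(feat, axes=(1, 2), norm='ortho')  # per-channel 2-D DCT
    mask = np.ones_like(spec)
    mask[:, cutoff:, :] = alpha  # frequencies beyond `cutoff` vertically...
    mask[:, :, cutoff:] = alpha  # ...and horizontally get scaled by `alpha`
    return idctn(spec * mask, axes=(1, 2), norm='ortho')
```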

Appendix 4: Style Mixing

We demonstrate more styled images transferred with compound styles generated by mixing the color basis and the stroke basis of two different styles. The results of both the spectrum-based method and the ICA method are shown in Fig. 19, in comparison with the traditional mixing method, interpolation.
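
A minimal sketch of such spectrum-based mixing follows, under the same assumption as above (color as the low-frequency band, stroke as the high-frequency band of the per-channel DCT); the `cutoff` parameter is illustrative.

```python
# Minimal sketch: combine the low-frequency (assumed color) coefficients of
# style A with the high-frequency (assumed stroke) coefficients of style B.
import numpy as np
from scipy.fft import dctn, idctn

def mix_styles(feat_a, feat_b, cutoff=8):
    """feat_a, feat_b: (C, H, W) feature maps of the two styles."""
    spec_a = dctn(feat_a, axes=(1, 2), norm='ortho')
    spec_b = dctn(feat_b, axes=(1, 2), norm='ortho')
    mixed = spec_b.copy()                                     # stroke from B
    mixed[:, :cutoff, :cutoff] = spec_a[:, :cutoff, :cutoff]  # color from A
    return idctn(mixed, axes=(1, 2), norm='ortho')
```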

Fig. 19. The two leftmost columns are the style images used for mixing; specifically, we mix the color of the leftmost image with the stroke of the second image. The third column shows the styled images produced by the traditional interpolation method, the fourth column those produced by the spectrum mixing method, and the rightmost column those produced by the ICA mixing method.

© 2019 Springer Nature Switzerland AG

Cite this paper

Li, M., Tu, S., Xu, L. (2019). Computational Decomposition of Style for Controllable and Enhanced Style Transfer. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds.) Intelligence Science and Big Data Engineering. Big Data and Machine Learning. IScIDE 2019. Lecture Notes in Computer Science, vol. 11936. Springer, Cham. https://doi.org/10.1007/978-3-030-36204-1_2