Computational Decomposition of Style for Controllable and Enhanced Style Transfer

Conference paper in Intelligence Science and Big Data Engineering. Big Data and Machine Learning (IScIDE 2019), Lecture Notes in Computer Science, vol. 11936.

Abstract

Neural style transfer has been demonstrated to be powerful in creating artistic images with the help of Convolutional Neural Networks (CNNs), but continuously controllable transfer remains a challenging task. This paper provides a computational decomposition of style into basic factors, which aim to be factorized, interpretable representations of artistic styles. We propose to decompose style not only by spectrum-based methods, including the Fast Fourier Transform and the Discrete Cosine Transform, but also by latent variable models such as Principal Component Analysis and Independent Component Analysis. Such decomposition induces various ways of controlling the style factors to generate enhanced, diversified styled images. We mix or intervene on the style bases from more than one style, so that compound or new styles can be generated to produce styled images. To implement our method, we derive a simple, effective computational module that can be embedded into state-of-the-art style transfer algorithms. Experiments demonstrate the effectiveness of our method not only on painting style transfer but also on other possible applications such as picture-to-sketch problems.

References

  1. Champandard, A.J.: Semantic style transfer and turning two-bit doodles into fine artworks. arXiv preprint arXiv:1603.01768 (2016)

  2. Chen, Y., Tu, S., Yi, Y., Xu, L.: Sketch-pix2seq: a model to generate sketches of multiple categories. arXiv preprint arXiv:1709.04121 (2017)

  3. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

  4. van der Maaten, L., Hinton, G.E.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)

  5. Efros, A.A., Leung, T.K.: Texture synthesis by non-parametric sampling. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1033–1038. IEEE (1999)

  6. Engstrom, L.: Fast style transfer (2016). https://github.com/lengstrom/fast-style-transfer/

  7. Gatys, L., Ecker, A.S., Bethge, M.: Texture synthesis using convolutional neural networks. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28, pp. 262–270. Curran Associates, Inc. (2015)

  8. Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)

  9. Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3985–3993 (2017)

  10. Hertzmann, A.P.: Algorithms for rendering in artistic styles. Ph.D. thesis, New York University, Graduate School of Arts and Science (2001)

  11. Hyvarinen, A.: Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw. 10(3), 626–634 (1999)

  12. Jing, Y., Yang, Y., Feng, Z., Ye, J., Yu, Y., Song, M.: Neural style transfer: a review. IEEE Trans. Vis. Comput. Graph. (2019)

  13. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43

  14. Li, C., Wand, M.: Combining Markov random fields and convolutional neural networks for image synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2479–2486 (2016)

  15. Li, C., Wand, M.: Precomputed real-time texture synthesis with Markovian generative adversarial networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 702–716. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_43

  16. Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)

  17. Rosin, P., Collomosse, J.: Image and Video-Based Artistic Stylisation, vol. 42. Springer, London (2012). https://doi.org/10.1007/978-1-4471-4519-6

  18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  19. Song, J., Pang, K., Song, Y.Z., Xiang, T., Hospedales, T.M.: Learning to sketch with shortcut cycle consistency. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 801–810 (2018)

  20. Tenenbaum, J.B., De Silva, V., Langford, J.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

  21. Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.: Texture networks: feed-forward synthesis of textures and stylized images. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning, ICML 2016, vol. 48, pp. 1349–1357 (2016)

  22. Wei, L.Y., Levoy, M.: Fast texture synthesis using tree-structured vector quantization. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2000, pp. 479–488. ACM Press/Addison-Wesley Publishing Co., New York (2000)

  23. Yin, R.: Content aware neural style transfer. arXiv preprint arXiv:1601.04568 (2016)

Acknowledgement

This work was supported by the Zhi-Yuan Chair Professorship Start-up Grant (WF220103010), and Startup Fund (WF220403029) for Youngman Research, from Shanghai Jiao Tong University.

Author information

Correspondence to Shikui Tu or Lei Xu.

Appendices

Appendix 1: The Stylization Effect of Every Activation Layer in VGG19

Since different layers in VGG19 [18] represent different levels of abstraction, we examine the stylization effect of every activation layer on a couple of different style images, as shown in Figs. 9 and 10. Our experiments show that not every layer is effective for style transfer; among those that work, shallow layers transfer only the coarse scale of style (color), while deep layers transfer both the coarse scale (color) and the detailed scale (stroke) of style, which conforms to the scale-control results in [9]. Since 'relu4_1' performs best in style transfer after the same number of learning epochs, we choose to study the feature map of 'relu4_1' in our research.
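
For concreteness, the following is a minimal sketch of how a single activation layer can be probed with the per-layer style loss of [8], assuming PyTorch/torchvision; the layer index for 'relu4_1' follows torchvision's VGG19 layout and is worth verifying against your torchvision version, and the helper names are ours.

```python
# Minimal sketch: per-layer Gram-matrix style loss as in Gatys et al. [8].
import torch
import torchvision.models as models

vgg = models.vgg19(pretrained=True).features.eval()
RELU4_1 = 20  # index of 'relu4_1' under torchvision's conv/ReLU/pool ordering

def features_at(x, layer_idx):
    """Run `x` through VGG19 up to and including `layer_idx`."""
    for i, module in enumerate(vgg):
        x = module(x)
        if i == layer_idx:
            return x
    raise IndexError(layer_idx)

def gram(feat):
    """Normalized Gram matrix of a (1, C, H, W) feature map."""
    _, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

def style_loss(generated, style, layer_idx=RELU4_1):
    """Squared distance between the Gram matrices at one chosen layer."""
    g = gram(features_at(generated, layer_idx))
    s = gram(features_at(style, layer_idx))
    return torch.mean((g - s) ** 2)
```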

We further visualize each channel of the feature map of the style image using t-SNE [4], as shown in Figs. 11 and 12, where the similarity between the two results is quite interesting. However, we cannot yet explain the specific patterns in these visualizations. Moreover, the relationship between the similarity of the visualization results and the similarity of the stylization effects of different VGG layers remains for further study.
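
This visualization step can be reproduced roughly as follows, assuming scikit-learn's t-SNE [4] and the `features_at` helper sketched above; the perplexity value is illustrative, not the paper's setting.

```python
# Minimal sketch: project every channel of one layer's feature map to 2-D.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_channels_tsne(style_image, layer_idx):
    feat = features_at(style_image, layer_idx).detach()  # (1, C, H, W)
    channels = feat.squeeze(0).flatten(1).cpu().numpy()  # (C, H*W): one row per channel
    xy = TSNE(n_components=2, perplexity=30).fit_transform(channels)
    plt.scatter(xy[:, 0], xy[:, 1], s=8)
    plt.show()
```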

Fig. 10. Styled images generated with [8] after the same number of learning epochs, using each single activation layer of the pre-trained VGG19.

Fig. 11. Low-dimensional projection of all channels of every single layer of the style image (Fig. 9(b)) via t-SNE [4].

Fig. 12. Low-dimensional projection of all channels of every single layer of the style image (Fig. 10(b)) via t-SNE [4].

Fig. 13. (a) Chinese paintings; (b) oil paintings (by Leonid Afremov); (c) pen sketches; (d) low-dimensional projections of the spectrum of styles (a–c) via Isomap [20].

Fig. 14. Low-dimensional projections of the spectrum of a large set of style images via Isomap [20]. The size of each image above carries no information; it is chosen only to prevent the images from overlapping.

Appendix 2: The Manifold of Spectrum Based Methods

We analyze the spectrum space by projecting the style bases into a low-dimensional space via Isomap [20], where the X-axis represents the color basis and the Y-axis represents the stroke basis; this analysis demonstrates the effectiveness and robustness of spectrum-based methods. We experiment with three artistic styles (shown in Fig. 13(a–c)). Chinese paintings and pen sketches share a similar color style that is sharply distinct from that of oil paintings, while the strokes of the three styles differ markedly from one another. Accordingly, as shown in Fig. 13(d), Chinese paintings and pen sketches lie close to each other and both far from the oil paintings along the X-axis (color), while the three styles are mutually separable along the Y-axis (stroke), which agrees with our analysis of the three artistic styles.
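
As a rough illustration of this projection, the sketch below summarizes each style image by the DCT magnitude spectrum of its 'relu4_1' feature map and embeds the descriptors with scikit-learn's Isomap [20]. The exact spectral descriptor is our assumption rather than the paper's specification, and `style_images` is a hypothetical list of preprocessed style tensors of equal spatial size.

```python
# Minimal sketch: Isomap embedding of spectral style descriptors.
import numpy as np
from scipy.fft import dctn
from sklearn.manifold import Isomap

def spectrum_descriptor(style_image, layer_idx=RELU4_1):
    """Average per-channel 2-D DCT magnitude spectrum of one feature map."""
    feat = features_at(style_image, layer_idx).detach().squeeze(0).cpu().numpy()
    spec = dctn(feat, axes=(1, 2), norm='ortho')  # 2-D DCT of each channel
    return np.abs(spec).mean(axis=0).ravel()

# `style_images`: hypothetical list of preprocessed (1, 3, H, W) style tensors.
descriptors = np.stack([spectrum_descriptor(img) for img in style_images])
embedding = Isomap(n_components=2).fit_transform(descriptors)  # (N, 2) points
```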

When we apply the same method to a large set of style images (Fig. 14), the X-axis clearly represents a linear transition from dull-colored to rich-colored. However, we fail to identify any notable linear transition along the Y-axis in the 2-dimensional visualization, probably because it is hard to describe the style of stroke (boldness, length, curvature, etc.) using only one dimension.

Fig. 15. Styled images with the stroke basis intervened using spectrum-based methods. The leftmost column shows the style images (from top to bottom: The Great Wave off Kanagawa, Katsushika Hokusai; Composition, Alberto Magnelli; Dancer, Ernst Ludwig Kirchner; Pistachio Tree in the Courtyard of the Chateau Noir, Paul Cezanne; Portrait, Lucian Freud). From left to right in each row, the effect of stroke is increasingly amplified.

Fig. 16. Styled images with the stroke basis intervened using spectrum-based methods. The leftmost column shows the style images (from top to bottom: A Muse (La Muse), Pablo Picasso; Number 4 (Gray and Red), Jackson Pollock; Shipwreck, J.M.W. Turner; Natura Morta, Giorgio Morandi; The Scream, Edvard Munch). From left to right in each row, the effect of stroke is increasingly amplified. (Color figure online)

Fig. 17. Styled images with the stroke basis intervened using ICA. The leftmost column shows the style images (from top to bottom: The Great Wave off Kanagawa, Katsushika Hokusai; Composition, Alberto Magnelli; Dancer, Ernst Ludwig Kirchner; Pistachio Tree in the Courtyard of the Chateau Noir, Paul Cezanne; Portrait, Lucian Freud). From left to right in each row, the effect of stroke is increasingly amplified.

Fig. 18. Styled images with the stroke basis intervened using ICA. The leftmost column shows the style images (from top to bottom: A Muse (La Muse), Pablo Picasso; Number 4 (Gray and Red), Jackson Pollock; Shipwreck, J.M.W. Turner; Natura Morta, Giorgio Morandi; The Scream, Edvard Munch). From left to right in each row, the effect of stroke is increasingly amplified. (Color figure online)

Appendix 3: Stroke Intervention

We demonstrate more styled images with the stroke basis intervened using the spectrum-based method (Figs. 15 and 16) and ICA (Figs. 17 and 18), respectively.
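
One plausible form of such an intervention is sketched below, under our assumption that the stroke basis corresponds to the high-frequency band of the per-channel DCT spectrum of the feature map; `alpha` and `cutoff` are illustrative parameters, not values from the paper.

```python
# Minimal sketch: amplify the high-frequency (assumed stroke) DCT band
# of a (C, H, W) feature map while leaving the low-frequency block intact.
import numpy as np
from scipy.fft import dctn, idctn

def amplify_stroke(feat, alpha=1.5, cutoff=8):
    spec = dctn(feat, axes=(1, 2), norm='ortho')  # per-channel 2-D DCT
    mask = np.ones_like(spec)
    mask[:, cutoff:, :] = alpha  # frequencies beyond `cutoff` vertically...
    mask[:, :, cutoff:] = alpha  # ...and horizontally get scaled by `alpha`
    return idctn(spec * mask, axes=(1, 2), norm='ortho')
```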

Appendix 4: Style Mixing

We demonstrate more styled images transferred with compound styles generated by mixing the color basis and the stroke basis of two different styles. The results of both the spectrum-based method and the ICA method are shown in Fig. 19, in comparison with the traditional mixing method, interpolation.
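
A minimal sketch of such spectrum-based mixing follows, under the same assumption as above (color as the low-frequency band, stroke as the high-frequency band of the per-channel DCT); the `cutoff` parameter is illustrative.

```python
# Minimal sketch: combine the low-frequency (assumed color) coefficients of
# style A with the high-frequency (assumed stroke) coefficients of style B.
import numpy as np
from scipy.fft import dctn, idctn

def mix_styles(feat_a, feat_b, cutoff=8):
    """feat_a, feat_b: (C, H, W) feature maps of the two styles."""
    spec_a = dctn(feat_a, axes=(1, 2), norm='ortho')
    spec_b = dctn(feat_b, axes=(1, 2), norm='ortho')
    mixed = spec_b.copy()                                     # stroke from B
    mixed[:, :cutoff, :cutoff] = spec_a[:, :cutoff, :cutoff]  # color from A
    return idctn(mixed, axes=(1, 2), norm='ortho')
```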

Fig. 19. The two leftmost columns are the style images used for mixing; specifically, we mix the color of the leftmost image with the stroke of the second image. The third column shows the styled images produced by the traditional interpolation method, the fourth column those produced by the spectrum mixing method, and the rightmost column those produced by the ICA mixing method.

© 2019 Springer Nature Switzerland AG

Cite this paper

Li, M., Tu, S., Xu, L. (2019). Computational Decomposition of Style for Controllable and Enhanced Style Transfer. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds.) Intelligence Science and Big Data Engineering. Big Data and Machine Learning. IScIDE 2019. Lecture Notes in Computer Science, vol. 11936. Springer, Cham. https://doi.org/10.1007/978-3-030-36204-1_2