Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields

  • Yongcheng Jing
  • Yang Liu
  • Yezhou Yang
  • Zunlei Feng
  • Yizhou Yu
  • Dacheng Tao
  • Mingli Song
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

Fast Style Transfer methods have recently been proposed to render a photograph in an artistic style in real time. Controlling the stroke size in the stylized results, however, remains an open challenge. In this paper, we present a stroke controllable style transfer network that achieves continuous and spatially varying stroke size control. By analyzing the factors that influence stroke size, we show that both the receptive field of the network and the scale of the style image must be accounted for explicitly. We propose a StrokePyramid module that endows the network with adaptive receptive fields, together with two training strategies: one for faster convergence and one for augmenting a trained model with new stroke sizes. By combining the proposed runtime control strategies, our network can vary stroke size continuously and produce distinct stroke sizes in different spatial regions within the same output image.

Keywords

Neural Style Transfer · Adaptive receptive fields

Acknowledgments

The first two authors contributed equally. Mingli Song is the corresponding author. This work is supported by the National Key Research and Development Program (2016YFB1200203), the National Natural Science Foundation of China (61572428, U1509206), the Fundamental Research Funds for the Central Universities (2017FZA5014), the Key Research and Development Program of Zhejiang Province (2018C01004), and Australian Research Council grants FL-170100117 and DP-180103424.

Supplementary material

Supplementary material 1: 474201_1_En_15_MOESM1_ESM.pdf (PDF, 30.1 MB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  2. Alibaba-Zhejiang University Joint Institute of Frontier Technologies, Hangzhou, China
  3. Arizona State University, Tempe, USA
  4. Deepwise AI Lab, Beijing, China
  5. UBTECH Sydney AI Centre, SIT, FEIT, University of Sydney, Sydney, Australia
