
International Journal of Computer Vision, Volume 126, Issue 11, pp 1199–1219

Artistic Style Transfer for Videos and Spherical Images

  • Manuel Ruder
  • Alexey Dosovitskiy
  • Thomas Brox

Abstract

Manually re-drawing an image in a certain artistic style takes a professional artist a long time; doing this for an entire video sequence by hand is practically infeasible. We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. In our first approach, we adapt to videos the original image style transfer technique by Gatys et al. based on energy minimization. We introduce new ways of initialization and new loss functions to generate consistent and stable stylized video sequences even in cases with large motion and strong occlusion. Our second approach formulates video stylization as a learning problem. We propose a deep network architecture and training procedures that allow us to stylize arbitrary-length videos in a consistent and stable way, and nearly in real time. We show that the proposed methods clearly outperform simpler baselines both qualitatively and quantitatively. Finally, we propose a way to adapt these approaches to 360° images and videos as they emerge with recent virtual reality hardware.
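The abstract's core idea for video — penalizing deviations between the current stylized frame and the previous stylized frame warped by optical flow, in regions where the flow is reliable — can be sketched as follows. This is a minimal illustration assuming NumPy arrays; the function names (`warp`, `temporal_loss`), the nearest-neighbour sampling, and the normalisation are our own simplifications, not the paper's exact formulation (which operates inside a network with sub-pixel warping and per-pixel occlusion weights).

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp `frame` (H, W, C) by optical `flow` (H, W, 2) using
    nearest-neighbour sampling; out-of-bounds coordinates are clamped."""
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def temporal_loss(stylized_cur, stylized_prev, flow, mask):
    """Mean squared difference between the current stylized frame and the
    flow-warped previous stylized frame, restricted to pixels where
    `mask` is 1 (flow reliable) and ignored where it is 0 (occlusions)."""
    warped = warp(stylized_prev, flow)
    diff = (stylized_cur - warped) ** 2
    m = mask[..., None]  # broadcast the mask over colour channels
    return float((m * diff).sum() / max(m.sum() * stylized_cur.shape[-1], 1))
```

With zero flow and identical consecutive frames the loss is exactly zero, which is the fixed point the consistency term drives the optimization toward.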

Keywords

Style transfer · Deep networks · Artistic videos · Video stylization
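The per-frame style term the abstract builds on — the Gatys et al. energy comparing Gram matrices of CNN feature maps — can be sketched in a few lines. This is an illustrative NumPy version only: real implementations compute these losses on VGG activations inside an autodiff framework, and the normalisation constant varies between papers.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map of shape (C, H, W), normalised by the
    number of spatial positions so the loss is resolution-independent."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(features_generated, features_style):
    """Squared Frobenius distance between the two Gram matrices — the
    style term of the energy, up to the choice of normalisation."""
    g_gen = gram_matrix(features_generated)
    g_sty = gram_matrix(features_style)
    return float(((g_gen - g_sty) ** 2).sum())
```

Because the Gram matrix discards spatial layout and keeps only feature correlations, minimising this distance matches texture statistics rather than content, which is what makes the result look painted in the target style.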

References

  1. Butler, D. J., Wulff, J., Stanley, G. B., & Black, M. J. (2012). A naturalistic open source movie for optical flow evaluation. In ECCV (pp. 611–625).
  2. Chen, D., Liao, J., Yuan, L., Yu, N., & Hua, G. (2017). Coherent online video style transfer. In ICCV (pp. 1114–1123).
  3. Collobert, R., Kavukcuoglu, K., & Farabet, C. (2011). Torch7: A MATLAB-like environment for machine learning. In BigLearn, NIPS Workshop.
  4. Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). Texture synthesis using convolutional neural networks. In NIPS (pp. 262–270).
  5. Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural networks. In CVPR (pp. 2414–2423).
  6. Ghiasi, G., Lee, H., Kudlur, M., Dumoulin, V., & Shlens, J. (2017). Exploring the structure of a real-time, arbitrary neural artistic stylization network. In BMVC.
  7. Gupta, A., Johnson, J., Alahi, A., & Fei-Fei, L. (2017). Characterizing and improving stability in neural style transfer. In ICCV (pp. 4087–4096).
  8. Hays, J., & Essa, I. (2004). Image and video based painterly animation. In Proceedings of the 3rd international symposium on non-photorealistic animation and rendering, NPAR (pp. 113–120).
  9. Huang, H., Wang, H., Luo, W., Ma, L., Jiang, W., Zhu, X., Li, Z., & Liu, W. (2017). Real-time neural style transfer for videos. In CVPR (pp. 7044–7052).
  10. Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., & Brox, T. (2017). FlowNet 2.0: Evolution of optical flow estimation with deep networks. In CVPR.
  11. Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML.
  12. Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In ECCV (pp. 694–711).
  13. Li, C., & Wand, M. (2016a). Combining Markov random fields and convolutional neural networks for image synthesis. In CVPR (pp. 2479–2486).
  14. Li, C., & Wand, M. (2016b). Precomputed real-time texture synthesis with Markovian generative adversarial networks. In ECCV (pp. 702–716).
  15. Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In ECCV.
  16. Litwinowicz, P. (1997). Processing images and video for an impressionist effect. In Proceedings of the 24th annual conference on computer graphics and interactive techniques, SIGGRAPH (pp. 407–414).
  17. Luan, F., Paris, S., Shechtman, E., & Bala, K. (2017). Deep photo style transfer. arXiv:1703.07511.
  18. Marszalek, M., Laptev, I., & Schmid, C. (2009). Actions in context. In CVPR (pp. 2929–2936).
  19. Nikulin, Y., & Novak, R. (2016). Exploring the neural algorithm of artistic style. arXiv:1602.07188.
  20. O’Donovan, P., & Hertzmann, A. (2012). AniPaint: Interactive painterly animation from video. IEEE Transactions on Visualization and Computer Graphics, 18(3), 475–487.
  21. Revaud, J., Weinzaepfel, P., Harchaoui, Z., & Schmid, C. (2015). EpicFlow: Edge-preserving interpolation of correspondences for optical flow. In CVPR (pp. 1164–1172).
  22. Ruder, M., Dosovitskiy, A., & Brox, T. (2016). Artistic style transfer for videos. In GCPR (pp. 26–36).
  23. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.
  24. Sundaram, N., Brox, T., & Keutzer, K. (2010). Dense point trajectories by GPU-accelerated large displacement optical flow. In ECCV (pp. 438–451).
  25. Ulyanov, D., Lebedev, V., Vedaldi, A., & Lempitsky, V. S. (2016). Texture networks: Feed-forward synthesis of textures and stylized images. In ICML (pp. 1349–1357).
  26. Ulyanov, D., Vedaldi, A., & Lempitsky, V. S. (2016). Instance normalization: The missing ingredient for fast stylization. arXiv:1607.08022.
  27. Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013). DeepFlow: Large displacement optical flow with deep matching. In ICCV (pp. 1385–1392).
  28. Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In ICLR.
  29. Zhang, H., & Dana, K. J. (2017). Multi-style generative network for real-time transfer. arXiv:1703.06953.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, BIOSS Centre for Biological Signalling Studies, University of Freiburg, Freiburg, Germany
