# Artistic Style Transfer for Videos and Spherical Images

## Abstract

Manually re-drawing an image in a certain artistic style takes a professional artist a long time. Doing this for a video sequence single-handedly is beyond imagination. We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. In our first approach, we adapt the original energy-minimization-based image style transfer technique of Gatys et al. to videos. We introduce new ways of initialization and new loss functions to generate consistent and stable stylized video sequences even in cases with large motion and strong occlusion. Our second approach formulates video stylization as a learning problem. We propose a deep network architecture and training procedures that allow us to stylize arbitrary-length videos in a consistent and stable way, and nearly in real time. We show that the proposed methods clearly outperform simpler baselines both qualitatively and quantitatively. Finally, we propose a way to adapt these approaches to 360\(^\circ\) images and videos, which are emerging with recent virtual reality hardware.
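The abstract does not spell out the loss functions it introduces. As a rough illustration of what a temporal consistency penalty for stylized video can look like, the sketch below assumes the previous stylized frame has already been warped into the current frame with optical flow, and that a per-pixel weight marks occluded regions with 0 so they are not penalized. The function name `temporal_loss`, the flat-list image representation, and the 0/1 weights are assumptions made for this example, not the authors' implementation.

```python
def temporal_loss(current, warped_prev, weights):
    """Weighted mean squared difference between two frames.

    All three arguments are flat lists of floats of equal length
    (a hypothetical, simplified image representation).
    weights[i] is 0.0 where pixel i is assumed occluded (no penalty)
    and 1.0 where the warped previous frame is assumed reliable.
    """
    if not (len(current) == len(warped_prev) == len(weights)):
        raise ValueError("inputs must have the same length")
    if not current:
        raise ValueError("inputs must be non-empty")
    total = sum(
        w * (x - p) ** 2
        for x, p, w in zip(current, warped_prev, weights)
    )
    return total / len(current)


# Toy example: four pixels, the last one is marked as occluded.
current = [0.5, 0.2, 0.9, 1.0]
warped_prev = [0.5, 0.4, 0.9, 0.0]
weights = [1.0, 1.0, 1.0, 0.0]
loss = temporal_loss(current, warped_prev, weights)
```

In this toy case only the second pixel contributes to the penalty. The large difference at the fourth pixel is ignored because it is flagged as occluded, which is the kind of behavior needed to stay stable under strong occlusion.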

## Keywords

Style transfer, Deep networks, Artistic videos, Video stylization

## References

- Butler, D. J., Wulff, J., Stanley, G. B., & Black, M. J. (2012). A naturalistic open source movie for optical flow evaluation. In *ECCV* (pp. 611–625).
- Chen, D., Liao, J., Yuan, L., Yu, N., & Hua, G. (2017). Coherent online video style transfer. In *ICCV* (pp. 1114–1123).
- Collobert, R., Kavukcuoglu, K., & Farabet, C. (2011). Torch7: A matlab-like environment for machine learning. In *BigLearn*, NIPS Workshop.
- Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). Texture synthesis using convolutional neural networks. In *NIPS* (pp. 262–270).
- Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural networks. In *CVPR* (pp. 2414–2423).
- Ghiasi, G., Lee, H., Kudlur, M., Dumoulin, V., & Shlens, J. (2017). Exploring the structure of a real-time, arbitrary neural artistic stylization network. In *BMVC*.
- Gupta, A., Johnson, J., Alahi, A., & Fei-Fei, L. (2017). Characterizing and improving stability in neural style transfer. In *ICCV* (pp. 4087–4096).
- Hays, J., & Essa, I. (2004). Image and video based painterly animation. In *Proceedings of the 3rd international symposium on non-photorealistic animation and rendering, NPAR* (pp. 113–120).
- Huang, H., Wang, H., Luo, W., Ma, L., Jiang, W., Zhu, X., Li, Z., & Liu, W. (2017). Real-time neural style transfer for videos. In *CVPR* (pp. 7044–7052).
- Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., & Brox, T. (2017). Flownet 2.0: Evolution of optical flow estimation with deep networks. In *CVPR*.
- Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In *ICML*.
- Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In *ECCV* (pp. 694–711).
- Li, C., & Wand, M. (2016a). Combining markov random fields and convolutional neural networks for image synthesis. In *CVPR* (pp. 2479–2486).
- Li, C., & Wand, M. (2016b). Precomputed real-time texture synthesis with markovian generative adversarial networks. In *ECCV* (pp. 702–716).
- Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In *ECCV*.
- Litwinowicz, P. (1997). Processing images and video for an impressionist effect. In *Proceedings of the 24th annual conference on computer graphics and interactive techniques, SIGGRAPH* (pp. 407–414).
- Luan, F., Paris, S., Shechtman, E., & Bala, K. (2017). Deep photo style transfer. arXiv:1703.07511.
- Marszalek, M., Laptev, I., & Schmid, C. (2009). Actions in context. In *CVPR* (pp. 2929–2936).
- Nikulin, Y., & Novak, R. (2016). Exploring the neural algorithm of artistic style. CoRR. arXiv:1602.07188.
- O’Donovan, P., & Hertzmann, A. (2012). Anipaint: Interactive painterly animation from video. *Transactions on Visualization and Computer Graphics*, *18*(3), 475–487.
- Revaud, J., Weinzaepfel, P., Harchaoui, Z., & Schmid, C. (2015). Epicflow: Edge-preserving interpolation of correspondences for optical flow. In *CVPR* (pp. 1164–1172).
- Ruder, M., Dosovitskiy, A., & Brox, T. (2016). Artistic style transfer for videos. In *GCPR* (pp. 26–36).
- Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In *ICLR*.
- Sundaram, N., Brox, T., & Keutzer, K. (2010). Dense point trajectories by GPU-accelerated large displacement optical flow. In *ECCV* (pp. 438–451).
- Ulyanov, D., Lebedev, V., Vedaldi, A., & Lempitsky, V. S. (2016). Texture networks: Feed-forward synthesis of textures and stylized images. In *ICML* (pp. 1349–1357).
- Ulyanov, D., Vedaldi, A., & Lempitsky, V. S. (2016). Instance normalization: The missing ingredient for fast stylization. CoRR. arXiv:1607.08022.
- Weinzaepfel, P., Revaud, J., Harchaoui, Z., & Schmid, C. (2013). DeepFlow: Large displacement optical flow with deep matching. In *ICCV* (pp. 1385–1392).
- Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In *ICLR*.
- Zhang, H., & Dana, K. J. (2017). Multi-style generative network for real-time transfer. CoRR. arXiv:1703.06953.