Abstract
Video style transfer is a useful component for applications such as augmented reality, non-photorealistic rendering, and interactive games. Many existing methods use optical flow to preserve the temporal smoothness of the synthesized video. However, the estimation of optical flow is sensitive to occlusions and rapid motions. Thus, in this work, we introduce a novel evolve-sync loss computed by evolvements to replace optical flow. Using this evolve-sync loss, we build an adversarial learning framework, termed as Video Style Transfer Generative Adversarial Network (VST-GAN), which improves upon the MGAN method for image style transfer for more efficient video style transfer. We perform extensive experimental evaluations of our method and show quantitative and qualitative improvements over the state-of-the-art methods.
W. Li and L. Wen—Equally contributed.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
The naming fashion of real and fake samples follow the convention of GAN: the output \(\mathcal{Y}\) of G is considered to be fake, while the precomputed \(\mathcal {Y}^{\prime }\) is real.
References
Anderson, A.G., Berg, C.P., Mossing, D.P., Olshausen, B.A.: Deepmovie: Using optical flow and deep neural networks to stylize movies. CoRR abs/1605.08153 (2016)
Bousseau, A., Neyret, F., Thollot, J., Salesin, D.: Video watercolorization using bidirectional texture advection. TOG 26(3), 104 (2007)
Chen, D., Liao, J., Yuan, L., Yu, N., Hua, G.: Coherent online video style transfer. In: ICCV (2017)
Chen, D., Yuan, L., Liao, J., Yu, N., Hua, G.: Stylebank: an explicit representation for neural image style transfer. In: CVPR, pp. 1897–1906 (2017)
Chen, D., Yuan, L., Liao, J., Yu, N., Hua, G.: Stereoscopic neural style transfer. In: CVPR, pp. 1–9 (2018)
Denton, E.L., Chintala, S., Szlam, A., Fergus, R.: Deep generative image models using a laplacian pyramid of adversarial networks. In: NIPS, pp. 1486–1494 (2015)
Fan, Q., Chen, D., Yuan, L., Hua, G., Yu, N., Chen, B.: Decouple learning for parameterized image operators. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 455–471. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_27
Gatys, L.A., Ecker, A.S., Bethge, M.: Texture synthesis using convolutional neural networks. In: NIPS, pp. 262–270 (2015)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: CVPR, pp. 2414–2423 (2016)
Goodfellow, I.J., et al.: Generative adversarial nets. In: NIPS, pp. 2672–2680 (2014)
Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.J.: A kernel two-sample test. JMLR 13, 723–773 (2012)
Hays, J., Essa, I.A.: Image and video based painterly animation. In: NPAR, pp. 113–120 (2004)
He, M., Chen, D., Liao, J., Sander, P.V., Yuan, L.: Deep exemplar-based colorization. TOG 37(4), 47:1–47:16 (2018)
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. CoRR abs/1412.6980 (2014)
Kyprianidis, J.E., Collomosse, J.P., Wang, T., Isenberg, T.: State of the “art”: a taxonomy of artistic stylization techniques for images and video. TVCG 19(5), 866–885 (2013)
Li, C., Wand, M.: Precomputed real-time texture synthesis with markovian generative adversarial networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 702–716. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_43
Lu, J., Sander, P.V., Finkelstein, A.: Interactive painterly stylization of images, videos and 3D animations. In: SI3D, pp. 127–134 (2010)
Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: CVPR, pp. 5188–5196 (2015)
Mathieu, M., Couprie, C., LeCun, Y.: Deep multi-scale video prediction beyond mean square error, pp. 1–14 (2016)
Mathieu, M., Zhao, J.J., Sprechmann, P., Ramesh, A., LeCun, Y.: Disentangling factors of variation in deep representation using adversarial training. In: NIPS, pp. 5041–5049 (2016)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR abs/1511.06434 (2015)
Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: EpicFlow: edge-preserving interpolation of correspondences for optical flow. In: CVPR, pp. 1164–1172 (2015)
Ruder, M., Dosovitskiy, A., Brox, T.: Artistic style transfer for videos. In: GCPR, pp. 26–36 (2016)
Ulyanov, D., Lebedev, V., Vedaldi, A., Lempitsky, V.S.: Texture networks: feed-forward synthesis of textures and stylized images. In: ICML, pp. 1349–1357 (2016)
Vondrick, C., Pirsiavash, H., Torralba, A.: Generating videos with scene dynamics. In: NIPS, pp. 613–621 (2016)
Weinzaepfel, P., Revaud, J., Harchaoui, Z., Schmid, C.: Deepflow: large displacement optical flow with deep matching. In: ICCV, pp. 1385–1392 (2013)
Xu, T., et al.: AttnGAN: fine-grained text to image generation with attentional generative adversarial networks, pp. 1–9 (2018)
Zhang, H., Xu, T., Li, H.: StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In: ICCV, pp. 5908–5916 (2017)
Zhang, H., et al.: StackGAN++: Realistic image synthesis with stacked generative adversarial networks. CoRR abs/1710.10916 (2017)
Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 649–666. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_40
Zhang, S., Li, X., Hu, S., Martin, R.R.: Online video stream abstraction and stylization. TMM 13(6), 1286–1294 (2011)
Zhu, J.-Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Generative visual manipulation on the natural image manifold. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 597–613. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_36
Zhu, J., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, W., Wen, L., Bian, X., Lyu, S. (2019). Evolvement Constrained Adversarial Learning for Video Style Transfer. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_15
Download citation
DOI: https://doi.org/10.1007/978-3-030-20887-5_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20886-8
Online ISBN: 978-3-030-20887-5
eBook Packages: Computer ScienceComputer Science (R0)