Hybrid Warping Fusion for Video Frame Interpolation

International Journal of Computer Vision

Abstract

Video frame interpolation synthesizes new intermediate frames between existing ones and is an important task in video enhancement. A classic approach is flow-based: it estimates motion in the form of optical flow, warps the input frames, and synthesizes the final result. In this work, we explicitly investigate the warping step and propose a way to combine the strengths of forward and backward warping. Our method, named HWFI, introduces hybrid warping fusion for frame interpolation. We also incorporate edge information explicitly into our pipeline and employ channel attention in our synthesis network. Compared with the latest state-of-the-art method, which uses only forward warping, our method produces higher-quality results, especially in edge regions. Extensive experiments show that our method achieves the best results, both qualitatively and quantitatively, on multiple benchmark datasets.
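To make the distinction between the two warping directions concrete, the following is a minimal NumPy sketch, not the HWFI implementation. It assumes a single-channel image, a per-pixel flow field stored as `(dx, dy)`, and nearest-neighbour rounding in place of the bilinear sampling or learned splatting a real pipeline would likely use. The function names `backward_warp` and `forward_warp` are hypothetical and chosen only for illustration.

```python
import numpy as np


def backward_warp(src, flow):
    """Gather: out[y, x] = src[y + dy, x + dx], clamped to the image border.

    `flow` is defined on the *output* grid and points back into `src`.
    Every output pixel receives a value, so there are no holes, but
    clamping or occlusion can duplicate content.
    """
    h, w = src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return src[sy, sx]


def forward_warp(src, flow):
    """Scatter (splat): src[y, x] is pushed to (y + dy, x + dx).

    `flow` is defined on the *source* grid. Targets hit by several
    source pixels are averaged; targets hit by none remain holes,
    reported in the returned boolean `covered` mask.
    """
    h, w = src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)
    valid = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)

    out = np.zeros((h, w), dtype=float)
    hits = np.zeros((h, w), dtype=float)
    # np.add.at accumulates correctly when several pixels share a target.
    np.add.at(out, (ty[valid], tx[valid]), src[valid])
    np.add.at(hits, (ty[valid], tx[valid]), 1.0)

    covered = hits > 0
    out[covered] /= hits[covered]
    return out, covered


if __name__ == "__main__":
    # One-row image with a uniform flow of one pixel to the right.
    src = np.array([[1.0, 2.0, 3.0, 4.0]])
    flow = np.zeros((1, 4, 2))
    flow[..., 0] = 1.0

    print(backward_warp(src, flow))
    warped, covered = forward_warp(src, flow)
    print(warped, covered)
```

In this toy case the forward warp leaves the leftmost pixel uncovered, while the backward warp fills every pixel but repeats the border value. These complementary failure modes are the kind of trade-off that motivates fusing the two warping directions.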

Author information

Corresponding author

Correspondence to Yu Li.

Additional information

Communicated by Shaodi You.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 24209 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Y., Zhu, Y., Li, R. et al. Hybrid Warping Fusion for Video Frame Interpolation. Int J Comput Vis 130, 2980–2993 (2022). https://doi.org/10.1007/s11263-022-01683-9

  • DOI: https://doi.org/10.1007/s11263-022-01683-9
