Enhanced Quadratic Video Interpolation

Liu, Yihao; Xie, Liangbin; Siyao, Li; Sun, Wenxiu; Qiao, Yu; Dong, Chao

doi:10.1007/978-3-030-66823-5_3

Yihao Liu¹⁰,¹¹,
Liangbin Xie¹⁰,¹¹,
Li Siyao¹²,
Wenxiu Sun¹²,
Yu Qiao¹⁰ &
Chao Dong¹⁰

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12538)

Included in the following conference series:

European Conference on Computer Vision

Abstract

With the growth of the digital video industry, video frame interpolation has attracted sustained attention in the computer vision community and growing interest in industry. Many learning-based methods have been proposed and have steadily improved results. Among them, a recent algorithm named quadratic video interpolation (QVI) achieves appealing performance. It exploits higher-order motion information (e.g. acceleration) to model the interpolated flow. However, the intermediate frames it produces still contain ghosting, artifacts and inaccurate motion, especially when large and complex motion occurs. In this work, we improve QVI in three respects and propose an enhanced quadratic video interpolation (EQVI) model. First, we adopt a rectified quadratic flow prediction (RQFP) formulation that uses the least squares method to estimate motion more accurately. Second, to complement pixel-level image blending, we introduce a residual contextual synthesis network (RCSN) that exploits contextual information in a high-dimensional feature space, which helps the model handle more complicated scenes and motion patterns. Third, to further boost performance, we devise a novel multi-scale fusion network (MS-Fusion), which can be regarded as a learnable augmentation process. The proposed EQVI model won first place in the AIM2020 Video Temporal Super-Resolution Challenge. Code is available at https://github.com/lyh-18/EQVI.
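To make the role of acceleration concrete, the following is a minimal sketch of quadratic flow prediction from two optical flows, compared with a linear prediction. It assumes a constant-acceleration motion model, f(0→t) = v0·t + ½·a·t², which is not spelled out on this page; the function names `quadratic_flow` and `linear_flow` are hypothetical and do not come from the EQVI repository.

```python
import numpy as np


def quadratic_flow(f_0_to_1, f_0_to_m1, t):
    """Predict the flow from frame 0 to time t under constant acceleration.

    Assumed model: f(0->t) = v0 * t + 0.5 * a * t**2.
    Plugging in t = 1 and t = -1 gives two equations that determine v0 and a.
    """
    a = f_0_to_1 + f_0_to_m1            # acceleration
    v0 = 0.5 * (f_0_to_1 - f_0_to_m1)   # velocity at frame 0
    return v0 * t + 0.5 * a * t**2


def linear_flow(f_0_to_1, t):
    """Baseline: constant-velocity prediction that ignores acceleration."""
    return t * f_0_to_1


# Toy example: one pixel with true v0 = (2, 0) and a = (1, 0).
f_0_to_1 = np.array([2.5, 0.0])     # v0 + 0.5 * a
f_0_to_m1 = np.array([-1.5, 0.0])   # -v0 + 0.5 * a
print(quadratic_flow(f_0_to_1, f_0_to_m1, 0.5))
print(linear_flow(f_0_to_1, 0.5))
```

On this toy trajectory the quadratic prediction matches the true displacement at t = 0.5, while the linear one overshoots because it treats the motion as uniform.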

Y. Liu and L. Xie—Co-first authors.

Notes

1. RQFP has no trainable parameters; it only rectifies the formula for intermediate flow estimation. However, it requires matrix multiplication and increases training time, so we enable it only after RCSN has been added, which speeds up the overall learning process.
2. Substitute \(t=-1\), \(t=1\) and \(t=2\) into Eq. (2), respectively.
3. Derived from the first and second formulas of Eq. (4). The others are derived similarly.
4. https://data.vision.ee.ethz.ch/cvl/aim20/.
5. The serial numbers are 002, 005, 010, 017 and 025.
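Note 2 describes evaluating the motion model at \(t=-1\), \(t=1\) and \(t=2\), which gives three equations for two unknowns. The sketch below shows one way such an overdetermined system can be solved per pixel with least squares. It is an illustration under stated assumptions, not the authors' implementation: Eq. (2) is not reproduced on this page, so the constant-acceleration form f(0→t) = v0·t + ½·a·t² is assumed, and the names `fit_quadratic_motion` and `predict_flow` are hypothetical.

```python
import numpy as np


def fit_quadratic_motion(flows, times):
    """Least-squares fit of v0 and a in the assumed model
    f(0->t) = v0 * t + 0.5 * a * t**2.

    flows: sequence of K flow fields, each of shape (H, W, 2)
    times: sequence of K time stamps, e.g. (-1, 1, 2)
    Returns (v0, a), each of shape (H, W, 2).
    """
    times = np.asarray(times, dtype=np.float64)
    # Design matrix: one row [t, 0.5 * t^2] per observed flow.
    X = np.stack([times, 0.5 * times**2], axis=1)                    # (K, 2)
    Y = np.stack([np.asarray(f, dtype=np.float64) for f in flows])   # (K, H, W, 2)
    k = Y.shape[0]
    # Solve every pixel and flow component at once; columns are independent.
    coeffs, *_ = np.linalg.lstsq(X, Y.reshape(k, -1), rcond=None)    # (2, H*W*2)
    v0 = coeffs[0].reshape(Y.shape[1:])
    a = coeffs[1].reshape(Y.shape[1:])
    return v0, a


def predict_flow(v0, a, t):
    """Evaluate the fitted motion model at an intermediate time t."""
    return v0 * t + 0.5 * a * t**2


# Synthetic check: flows generated from known v0 and a are recovered.
rng = np.random.default_rng(0)
v0_true = rng.normal(size=(4, 5, 2))
a_true = rng.normal(size=(4, 5, 2))
times = (-1.0, 1.0, 2.0)
flows = [predict_flow(v0_true, a_true, t) for t in times]
v0_est, a_est = fit_quadratic_motion(flows, times)
flow_half = predict_flow(v0_est, a_est, 0.5)  # flow from frame 0 to t = 0.5
```

With noise-free flows the fit recovers v0 and a exactly (up to floating-point error). With noisy flow estimates, the third observation lets the least-squares solution average out some of the error, which appears to be the motivation for using more than two flows.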

Acknowledgement

This work is partially supported by the National Natural Science Foundation of China (61906184), the Science and Technology Service Network Initiative of the Chinese Academy of Sciences (KFJ-STS-QYZX-092), the Shenzhen Basic Research Program (JSGG20180507182100698, CXB201104220032A), and the Joint Lab of CAS-HK, Shenzhen Institute of Artificial Intelligence and Robotics for Society.

Author information

Authors and Affiliations

ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yihao Liu, Liangbin Xie, Yu Qiao & Chao Dong
University of Chinese Academy of Sciences, Beijing, China
Yihao Liu & Liangbin Xie
SenseTime Research, Beijing, China
Li Siyao & Wenxiu Sun

Corresponding author

Correspondence to Yihao Liu.

Editor information

Editors and Affiliations

University of Clermont Auvergne, Clermont Ferrand, France
Adrien Bartoli
Università degli Studi di Udine, Udine, Italy
Andrea Fusiello

About this paper

Cite this paper

Liu, Y., Xie, L., Siyao, L., Sun, W., Qiao, Y., Dong, C. (2020). Enhanced Quadratic Video Interpolation. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12538. Springer, Cham. https://doi.org/10.1007/978-3-030-66823-5_3

DOI: https://doi.org/10.1007/978-3-030-66823-5_3
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66822-8
Online ISBN: 978-3-030-66823-5
eBook Packages: Computer Science, Computer Science (R0)
