Abstract
Video inpainting is an important technique for a wide variety of applications, from video content editing to video restoration. Early approaches follow image inpainting paradigms but are challenged by complex camera motion and non-rigid deformations. To address these challenges, flow-guided propagation techniques have been proposed. However, computing flow for unobserved regions is non-trivial, and propagation across a whole video sequence is computationally demanding. In contrast, in this paper we propose a proposal-based video inpainting algorithm: we use 3D convolutions to obtain an initial inpainting estimate, which is subsequently refined by fusing a generated set of proposals. Different from existing approaches to video inpainting, and inspired by well-explored mechanisms for object detection, we argue that proposals provide a rich source of information, permitting the combination of similar-looking patches that may be spatially and temporally far from the region to be inpainted. We validate the effectiveness of our method on the challenging YouTube-VOS and DAVIS datasets under different settings and demonstrate results that outperform the state of the art on standard metrics.
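The refinement idea in the abstract — weighting candidate proposals by how well they match an initial estimate and blending them — can be illustrated with a toy sketch. This is not the paper's architecture; `fuse_proposals`, the negative-L2 similarity score, and the 50/50 blend with the initial estimate are all illustrative assumptions chosen for brevity.

```python
import numpy as np

def fuse_proposals(initial, proposals, temperature=1.0):
    """Toy sketch of proposal fusion: refine an initial inpainting
    estimate by softmax-blending candidate patches, weighted by their
    similarity to that estimate. Names and weighting scheme are
    illustrative, not the paper's actual method."""
    # Similarity of each proposal patch to the initial estimate (negative L2).
    scores = np.array([-np.sum((p - initial) ** 2) for p in proposals])
    scores = scores / temperature
    # Numerically stable softmax over proposals.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Convex combination of proposals, averaged with the initial estimate.
    blended = sum(w * p for w, p in zip(weights, proposals))
    return 0.5 * (initial + blended)

# Usage on random 4x4 "patches" standing in for pixels of the hole region.
rng = np.random.default_rng(0)
initial = rng.random((4, 4))
proposals = [rng.random((4, 4)) for _ in range(3)]
refined = fuse_proposals(initial, proposals)
```

In the actual method the initial estimate comes from a 3D-convolutional network and the proposals are gathered across space and time; the sketch only conveys the fusion step's general shape.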
Acknowledgements
This work is supported in part by NSF under Grant No. 1718221 and MRI #1725729, UIUC, Samsung, 3M, and Cisco Systems Inc. (Gift Award CG 1377144). We thank Cisco for access to the Arcetri cluster.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Hu, YT., Wang, H., Ballas, N., Grauman, K., Schwing, A.G. (2020). Proposal-Based Video Completion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12372. Springer, Cham. https://doi.org/10.1007/978-3-030-58583-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58582-2
Online ISBN: 978-3-030-58583-9
eBook Packages: Computer Science