Abstract
Traditional 2D animation is labor-intensive, often requiring animators to manually draw twelve illustrations per second of movement. While automatic frame interpolation may ease this burden, 2D animation poses additional difficulties compared to photorealistic video. In this work, we address challenges unexplored in previous animation interpolation systems, with a focus on improving perceptual quality. Firstly, we propose SoftsplatLite (SSL), a forward-warping interpolation architecture with fewer trainable parameters and better perceptual performance. Secondly, we design a Distance Transform Module (DTM) that leverages line proximity cues to correct aberrations in difficult solid-color regions. Thirdly, we define a Restricted Relative Linear Discrepancy metric (RRLD) to automate the previously manual training data collection process. Lastly, we explore the evaluation of generated 2D animation through a user study, and establish that the LPIPS perceptual metric and chamfer line distance (CD) are more appropriate measures of quality than the PSNR and SSIM used in prior art.
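To make the line-proximity and evaluation ideas above concrete, the following is a minimal Python sketch of a symmetric chamfer distance between two binary line drawings, built on SciPy's Euclidean distance transform. The mask representation, averaging scheme, and normalization here are illustrative assumptions, not the paper's exact CD definition or its Distance Transform Module.

import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_line_distance(lines_a, lines_b):
    """Symmetric chamfer distance between two boolean line masks (True = line pixel)."""
    # distance_transform_edt gives each pixel's distance to the nearest zero entry,
    # so inverting a mask yields, at every pixel, the distance to that image's nearest line pixel.
    dist_to_b = distance_transform_edt(~lines_b)
    dist_to_a = distance_transform_edt(~lines_a)
    # Average the nearest-line distances over each drawing's own line pixels, in both directions.
    a_to_b = dist_to_b[lines_a].mean() if lines_a.any() else 0.0
    b_to_a = dist_to_a[lines_b].mean() if lines_b.any() else 0.0
    return 0.5 * (a_to_b + b_to_a)

# Two synthetic strokes offset vertically by two pixels: the distance is roughly 2.0.
a = np.zeros((64, 64), dtype=bool); a[32, 10:50] = True
b = np.zeros((64, 64), dtype=bool); b[34, 10:50] = True
print(chamfer_line_distance(a, b))

Unlike PSNR or SSIM, a distance-transform-based measure of this kind responds to how far line work has drifted rather than to per-pixel intensity mismatch, which is why line-based metrics suit sparse line drawings and solid-color animation regions.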
Acknowledgements
The authors would like to thank Lillian Huang and Saeed Hadadan for their discussion and feedback, as well as NVIDIA for GPU support.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chen, S., Zwicker, M. (2022). Improving the Perceptual Quality of 2D Animation Interpolation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_17