Abstract
In this work we review the coarse-to-fine spatial feature pyramid concept used in state-of-the-art optical flow estimation networks to make exploration of the per-pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve cost volume construction by moving from a warping-based to a sampling-based strategy, which avoids ghosting artifacts and hence better preserves fine flow details. We further amplify these gains through a level-specific loss max-pooling strategy that adaptively shifts the focus of the learning process onto under-performing predictions. Our second contribution revises the gradient flow across pyramid levels: the operations typically performed at each level can produce noisy, or even contradictory, gradients across levels, and we show how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept that counteracts catastrophic forgetting during finetuning and thus preserves knowledge in models trained sequentially on multiple datasets. Our findings are conceptually simple and easy to implement, yet yield compelling improvements on relevant error measures, which we demonstrate via exhaustive ablations on datasets such as Flying Chairs2, Flying Things, Sintel, and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and show that our findings carry over to other optical flow and depth-from-stereo approaches.
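To illustrate the loss max-pooling idea mentioned in the abstract, the following is a minimal NumPy sketch: instead of averaging the loss over all pixels, only the highest-loss fraction contributes to the gradient. The `keep_frac` parameter and the hard top-k selection are simplifying assumptions for illustration; the paper uses an adaptive, level-specific variant, not this exact rule.

```python
import numpy as np

def loss_max_pooling(per_pixel_loss, keep_frac=0.25):
    """Average only the highest-loss fraction of pixels.

    per_pixel_loss: array of per-pixel training losses (any shape).
    keep_frac: hypothetical fraction of pixels kept (assumption, not
               the paper's adaptive schedule).
    """
    flat = np.sort(per_pixel_loss.ravel())[::-1]  # sort descending
    k = max(1, int(keep_frac * flat.size))        # number of pixels kept
    return float(flat[:k].mean())                 # focus on the hardest pixels

# Usage: with keep_frac=0.5, only the two largest of four losses are averaged.
loss = loss_max_pooling(np.array([[1.0, 2.0], [3.0, 4.0]]), keep_frac=0.5)
```

Because easy, already well-predicted pixels are excluded from the average, the optimizer's attention concentrates on regions where the flow prediction is still poor, such as fine structures and motion boundaries.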
Acknowledgements
T. Pock and M. Hofinger acknowledge that this work was supported by the ERC starting grant HOMOVIS (No. 640156).
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 2 (mp4 67661 KB)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Hofinger, M., Rota Bulò, S., Porzi, L., Knapitsch, A., Pock, T., Kontschieder, P. (2020). Improving Optical Flow on a Pyramid Level. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol. 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_46
DOI: https://doi.org/10.1007/978-3-030-58604-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58603-4
Online ISBN: 978-3-030-58604-1