Abstract
In this work we review the coarse-to-fine spatial feature pyramid concept used in state-of-the-art optical flow estimation networks to make exploration of the per-pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve cost volume construction by moving from a warping-based to a sampling-based strategy, which avoids ghosting artifacts and hence better preserves fine flow details. We further amplify these gains through a level-specific loss max-pooling strategy that adaptively shifts the focus of the learning process onto under-performing predictions. Our second contribution revises the gradient flow across pyramid levels: the operations typically performed at each level can produce noisy, or even contradictory, gradients across levels, and we show how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept that counteracts catastrophic forgetting during finetuning and thus preserves knowledge in models trained sequentially on multiple datasets. Our findings are conceptually simple and easy to implement, yet yield compelling improvements on relevant error measures, which we demonstrate via exhaustive ablations on datasets such as Flying Chairs2, Flying Things, Sintel, and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and show that our findings carry over to other optical flow and depth-from-stereo approaches.
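To illustrate the loss max-pooling idea mentioned in the abstract, the following is a minimal NumPy sketch: instead of averaging the loss over all pixels, only the highest-loss fraction contributes to the gradient. The `keep_frac` parameter and the hard top-k selection are simplifying assumptions for illustration; the paper uses an adaptive, level-specific variant, not this exact rule.

```python
import numpy as np

def loss_max_pooling(per_pixel_loss, keep_frac=0.25):
    """Average only the highest-loss fraction of pixels.

    per_pixel_loss: array of per-pixel training losses (any shape).
    keep_frac: hypothetical fraction of pixels kept (assumption, not
               the paper's adaptive schedule).
    """
    flat = np.sort(per_pixel_loss.ravel())[::-1]  # sort descending
    k = max(1, int(keep_frac * flat.size))        # number of pixels kept
    return float(flat[:k].mean())                 # focus on the hardest pixels

# Usage: with keep_frac=0.5, only the two largest of four losses are averaged.
loss = loss_max_pooling(np.array([[1.0, 2.0], [3.0, 4.0]]), keep_frac=0.5)
```

Because easy, already well-predicted pixels are excluded from the average, the optimizer's attention concentrates on regions where the flow prediction is still poor, such as fine structures and motion boundaries.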
Acknowledgements
T. Pock and M. Hofinger acknowledge that this work was supported by the ERC starting grant HOMOVIS (No. 640156).
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 2 (mp4 67661 KB)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Hofinger, M., Rota Bulò, S., Porzi, L., Knapitsch, A., Pock, T., Kontschieder, P. (2020). Improving Optical Flow on a Pyramid Level. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol. 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_46
DOI: https://doi.org/10.1007/978-3-030-58604-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58603-4
Online ISBN: 978-3-030-58604-1