
Learnable Cost Volume Using the Cayley Representation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12354)

Abstract

Cost volume is an essential component of recent deep models for optical flow estimation and is usually constructed by computing the inner product between two feature vectors. However, the standard inner product in the commonly used cost volume may limit the representation capacity of flow models, because it neglects the correlation among different channel dimensions and weighs each dimension equally. To address this issue, we propose a learnable cost volume (LCV) using an elliptical inner product, which generalizes the standard inner product with a positive definite kernel matrix. To guarantee its positive definiteness, we perform spectral decomposition on the kernel matrix and re-parameterize it via the Cayley representation. The proposed LCV is a lightweight module that can be easily plugged into existing models to replace the vanilla cost volume. Experimental results show that the LCV module not only improves the accuracy of state-of-the-art models on standard benchmarks, but also promotes their robustness against illumination changes, noise, and adversarial perturbations of the input signals.
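
The construction described above (a positive definite kernel matrix obtained from a spectral decomposition whose orthogonal factor is re-parameterized via the Cayley representation) can be illustrated with a short sketch. The following is a minimal PyTorch example based only on the abstract's description; the class and parameter names (`LearnableCostVolumeKernel`, `raw_eigs`, etc.) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableCostVolumeKernel(nn.Module):
    """Minimal sketch of an elliptical inner product <f1, M f2> with a
    Cayley-parameterized symmetric positive definite kernel M.
    Illustrative only; details are assumptions based on the abstract."""

    def __init__(self, dim):
        super().__init__()
        self.dim = dim
        # Unconstrained parameters: W yields a skew-symmetric matrix,
        # raw_eigs yields strictly positive eigenvalues.
        self.W = nn.Parameter(torch.zeros(dim, dim))
        self.raw_eigs = nn.Parameter(torch.zeros(dim))

    def kernel(self):
        # Skew-symmetric A has purely imaginary eigenvalues,
        # so (I + A) is always invertible.
        A = self.W - self.W.t()
        I = torch.eye(self.dim, device=A.device, dtype=A.dtype)
        # Cayley representation: Q = (I + A)^{-1}(I - A) is orthogonal
        # for any skew-symmetric A (solve is used instead of an explicit
        # inverse for numerical stability).
        Q = torch.linalg.solve(I + A, I - A)
        # Softplus keeps the eigenvalues strictly positive, which
        # guarantees M = Q diag(lam) Q^T is positive definite.
        lam = F.softplus(self.raw_eigs) + 1e-6
        return Q @ torch.diag(lam) @ Q.t()

    def forward(self, f1, f2):
        # f1, f2: feature vectors of shape (..., dim).
        # Elliptical inner product replacing the plain dot product.
        M = self.kernel()
        return torch.einsum('...c,cd,...d->...', f1, M, f2)
```

In a flow network, this elliptical product would presumably replace the plain dot product applied when correlating feature vectors across candidate displacements, with the rest of the cost-volume construction left unchanged.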

Keywords

Optical flow · Cost volume · Cayley representation · Inner product

Notes

Acknowledgements

This work is supported in part by NSF CAREER Grant 1149783. We also thank Pengpeng Liu and Jingfeng Wu for kind help.

Supplementary material

Supplementary material 1 (pdf, 6314 KB)

Supplementary material 2 (mp4, 17192 KB)

Supplementary material 3 (mp4, 33238 KB)

Supplementary material 4 (mp4, 27663 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of California, Merced, USA
  2. Google Research, Mountain View, USA
  3. Nankai University, Tianjin, China
  4. Peking University, Beijing, China
