RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12347)

Abstract

We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes. RAFT achieves state-of-the-art performance. On KITTI, RAFT achieves an F1-all error of 5.10%, a 16% error reduction from the best published result (6.10%). On Sintel (final pass), RAFT obtains an end-point-error of 2.855 pixels, a 30% error reduction from the best published result (4.098 pixels). In addition, RAFT has strong cross-dataset generalization as well as high efficiency in inference time, training speed, and parameter count. Code is available at https://github.com/princeton-vl/RAFT.
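
To make the all-pairs construction concrete, here is a minimal sketch of the 4D correlation volume and its multi-scale pyramid as described in the abstract. This is an illustration in PyTorch, not the released implementation; the function names and the scaling by the square root of the feature dimension are assumptions made here for clarity (see the repository linked above for the actual code).

```python
import torch
import torch.nn.functional as F

def all_pairs_correlation(fmap1, fmap2):
    """Dot-product correlation between every pair of pixels.

    fmap1, fmap2: (B, D, H, W) feature maps of the two frames.
    Returns a 4D volume of shape (B, H, W, H, W).
    """
    B, D, H, W = fmap1.shape
    f1 = fmap1.reshape(B, D, H * W)
    f2 = fmap2.reshape(B, D, H * W)
    # (B, HW, HW): entry (i, j) is the similarity of pixel i in
    # frame 1 to pixel j in frame 2, scaled for numerical stability.
    corr = torch.einsum('bdi,bdj->bij', f1, f2) / D ** 0.5
    return corr.reshape(B, H, W, H, W)

def correlation_pyramid(corr, num_levels=4):
    """Pool only the last two (frame-2) dimensions, keeping full
    resolution over frame-1 pixels, to get multi-scale volumes."""
    B, H, W, _, _ = corr.shape
    pyramid = [corr]
    c = corr.reshape(B * H * W, 1, H, W)
    for _ in range(num_levels - 1):
        c = F.avg_pool2d(c, kernel_size=2, stride=2)
        pyramid.append(c.reshape(B, H, W, *c.shape[-2:]))
    return pyramid
```

The recurrent update operator then uses the current flow estimate to bilinearly sample a small local window from each pyramid level, so every iteration sees both fine and coarse matching evidence without recomputing the correlations.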

Acknowledgments

This work was partially funded by the National Science Foundation under Grant No. 1617767.

Supplementary material

Supplementary material 1 (PDF, 414 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

Zachary Teed and Jia Deng, Princeton University, Princeton, USA
