
Scalable Full Flow with Learned Binary Descriptors

  • Conference paper
Pattern Recognition (GCPR 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10496)


Abstract

We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume in two ways: a min-projection reduces memory complexity from quadratic to linear, and binary descriptors make matching efficient. This enables evaluation of the cost on the fly and allows learning and CRF inference to be performed on high-resolution images without ever storing the 4D cost volume. To address the problem of learning binary descriptors, we propose a new hybrid learning scheme. In contrast to current state-of-the-art approaches for learning binary CNNs, we can compute the exact non-zero gradient within our model. We compare several methods for training binary descriptors and show results on publicly available benchmarks.
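As a rough illustration of the two ideas named in the abstract, the sketch below computes Hamming matching costs between binary descriptors on the fly and keeps only two min-projected cost arrays per pixel instead of the full 4D volume. It assumes that the min-projection takes, for each horizontal displacement u, the minimum cost over all vertical displacements v (and vice versa); the function names, array layout, and border handling are hypothetical and not taken from the paper.

```python
import numpy as np


def hamming_cost(d1, d2):
    """Hamming distance between boolean descriptors along the last axis."""
    return np.count_nonzero(d1 != d2, axis=-1)


def min_projected_costs(desc1, desc2, radius):
    """Min-project the matching cost without storing the 4D cost volume.

    desc1, desc2: boolean arrays of shape (H, W, K), one K-bit descriptor
    per pixel (assumed layout). Returns cost_u and cost_v, each of shape
    (H, W, D) with D = 2 * radius + 1, where
        cost_u[y, x, u] = min over v of cost(y, x, u, v)
        cost_v[y, x, v] = min over u of cost(y, x, u, v)
    """
    H, W, K = desc1.shape
    D = 2 * radius + 1
    unseen = K + 1  # larger than any attainable Hamming distance
    cost_u = np.full((H, W, D), unseen, dtype=np.int64)
    cost_v = np.full((H, W, D), unseen, dtype=np.int64)
    for iv, v in enumerate(range(-radius, radius + 1)):
        # Rows of image 1 whose match at vertical offset v lies inside image 2.
        y0, y1 = max(0, -v), min(H, H - v)
        for iu, u in enumerate(range(-radius, radius + 1)):
            x0, x1 = max(0, -u), min(W, W - u)
            if y0 >= y1 or x0 >= x1:
                continue
            # One 2D slice of the cost volume, computed and then discarded.
            c = hamming_cost(desc1[y0:y1, x0:x1],
                             desc2[y0 + v:y1 + v, x0 + u:x1 + u])
            cu = cost_u[y0:y1, x0:x1, iu]  # views into the output arrays
            cv = cost_v[y0:y1, x0:x1, iv]
            np.minimum(cu, c, out=cu)
            np.minimum(cv, c, out=cv)
    return cost_u, cost_v
```

Under these assumptions, the storage per pixel grows with 2D rather than D², while each of the D² candidate costs is still evaluated once. Candidates that fall outside the second image are simply skipped here, which is one of several possible border policies.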


Notes

  1. Estimated for the cost volume size \(341 \times 145 \times 160 \times 160\), based on numbers in [5], corresponding to \(\frac{1}{3}\) resolution of Sintel images.

  2. Since we want to pose matching as a minimization problem.
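The first note does not say which quantity was estimated. As a back-of-the-envelope illustration only, the snippet below counts the entries of a cost volume of the stated size and compares them with two min-projected arrays of 160 entries per pixel. The 4-byte entry size and the projected layout are assumptions made for this sketch, not figures from the paper.

```python
# Dimensions from the note: 341 x 145 pixels, 160 x 160 displacement candidates.
width, height = 341, 145
range_u, range_v = 160, 160
BYTES_PER_COST = 4  # assumption: one 32-bit value per cost entry

# Full 4D cost volume: one entry per pixel and per (u, v) candidate.
full_entries = width * height * range_u * range_v

# Min-projected storage (assumed layout): one array over u, one over v.
projected_entries = width * height * (range_u + range_v)

full_gb = full_entries * BYTES_PER_COST / 1e9
projected_mb = projected_entries * BYTES_PER_COST / 1e6
ratio = full_entries // projected_entries

print(f"full volume: {full_entries:,} entries, {full_gb:.2f} GB")
print(f"projected:   {projected_entries:,} entries, {projected_mb:.2f} MB")
print(f"reduction factor: {ratio}")
```

With these numbers the full volume has about 1.27 billion entries, and the projected arrays are smaller by a factor of 80.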

References

  1. Bailer, C., Taetz, B., Stricker, D.: Flow Fields: dense correspondence fields for highly accurate large displacement optical flow estimation. In: International Conference on Computer Vision (ICCV) (2015)


  2. Bengio, Y., Léonard, N., Courville, A.C.: Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR abs/1308.3432 (2013). http://arxiv.org/abs/1308.3432

  3. Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33783-3_44


  4. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15561-1_56


  5. Chen, Q., Koltun, V.: Full Flow: optical flow estimation by global optimization over regular grids. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)


  6. Chen, Z., Sun, X., Wang, L., Yu, Y., Huang, C.: A deep visual correspondence embedding model for stereo matching costs. In: International Conference on Computer Vision (ICCV) (2015)


  7. Courbariaux, M., Bengio, Y.: BinaryNet: training deep neural networks with weights and activations constrained to +1 or \(-1\). CoRR abs/1602.02830 (2016). http://arxiv.org/abs/1602.02830

  8. Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient belief propagation for early vision. Int. J. Comput. Vis. 70(1), 41–54 (2006)


  9. Gadot, D., Wolf, L.: PatchBatch: a batch augmented loss for optical flow. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)


  10. Güney, F., Geiger, A.: Deep discrete flow. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10114, pp. 207–224. Springer, Cham (2017). doi:10.1007/978-3-319-54190-7_13


  11. Knöbelreiter, P., Reinbacher, C., Shekhovtsov, A., Pock, T.: End-to-end training of hybrid CNN-CRF models for stereo. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017). http://arxiv.org/abs/1611.10229

  12. Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)


  13. Luo, W., Schwing, A., Urtasun, R.: Efficient deep learning for stereo matching. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)


  14. Ranftl, R., Bredies, K., Pock, T.: Non-local total generalized variation for optical flow estimation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 439–454. Springer, Cham (2014). doi:10.1007/978-3-319-10590-1_29


  15. Ranftl, R., Gehrig, S., Pock, T., Bischof, H.: Pushing the limits of stereo using variational stereo estimation. In: IEEE Intelligent Vehicles Symposium (IV) (2012)


  16. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 525–542. Springer, Cham (2016). doi:10.1007/978-3-319-46493-0_32


  17. Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: EpicFlow: edge-preserving interpolation of correspondences for optical flow. In: Computer Vision and Pattern Recognition (CVPR) (2015)


  18. Shekhovtsov, A., Reinbacher, C., Graber, G., Pock, T.: Solving dense image matching in real-time using discrete-continuous optimization. ArXiv e-prints, January 2016


  19. Shekhovtsov, A., Kovtun, I., Hlaváč, V.: Efficient MRF deformation model for non-rigid image matching. CVIU 112, 91–99 (2008)


  20. Trzcinski, T., Christoudias, M., Fua, P., Lepetit, V.: Boosting binary keypoint descriptors. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)


  21. Wainwright, M., Jaakkola, T., Willsky, A.: MAP estimation via agreement on (hyper)trees: message-passing and linear-programming approaches. IEEE Trans. Inf. Theory 51(11), 3697–3717 (2005)


  22. Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017)


  23. Žbontar, J., LeCun, Y.: Computing the stereo matching cost with a convolutional neural network. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)



Acknowledgements

We acknowledge grant support from Toyota Motor Europe HS, the ERC starting grant HOMOVIS No. 640156 and the research initiative Intelligent Vision Austria with funding from the AIT and the Austrian Federal Ministry of Science, Research and Economy HRSM programme (BGBl. II Nr. 292/2012).


Corresponding author

Correspondence to Gottfried Munda.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Munda, G., Shekhovtsov, A., Knöbelreiter, P., Pock, T. (2017). Scalable Full Flow with Learned Binary Descriptors. In: Roth, V., Vetter, T. (eds) Pattern Recognition. GCPR 2017. Lecture Notes in Computer Science, vol 10496. Springer, Cham. https://doi.org/10.1007/978-3-319-66709-6_26


  • DOI: https://doi.org/10.1007/978-3-319-66709-6_26


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66708-9

  • Online ISBN: 978-3-319-66709-6

  • eBook Packages: Computer Science, Computer Science (R0)
