Scalable Full Flow with Learned Binary Descriptors

Munda, Gottfried; Shekhovtsov, Alexander; Knöbelreiter, Patrick; Pock, Thomas

doi:10.1007/978-3-319-66709-6_26

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10496)

Included in the following conference series:

German Conference on Pattern Recognition

Abstract

We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume in two ways: a min-projection reduces memory complexity from quadratic to linear, and binary descriptors make matching efficient. Together, these allow the cost to be evaluated on the fly, so that learning and CRF inference can be performed on high-resolution images without ever storing the 4D cost volume. To address the problem of learning binary descriptors, we propose a new hybrid learning scheme. In contrast to current state-of-the-art approaches for learning binary CNNs, we can compute the exact non-zero gradient within our model. We compare several methods for training binary descriptors and show results on publicly available benchmarks.
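
The combination of binary descriptors and min-projection described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the matching cost is the Hamming distance between per-pixel binary descriptors packed into integers, and that the min-projection keeps, for each pixel, the minimum cost over one flow component for every value of the other component. The names `hamming`, `min_projected_costs`, `desc1`, `desc2` and `radius` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed into Python ints."""
    return bin(a ^ b).count("1")


def min_projected_costs(desc1, desc2, x, y, radius):
    """Min-projected matching costs for pixel (x, y) of the first image.

    desc1, desc2: 2D lists (rows of ints), one packed binary descriptor per pixel.
    Returns (cost_u, cost_v), where
      cost_u[i] = min over v of cost(u_i, v)
      cost_v[j] = min over u of cost(u, v_j)
    Each cost is computed on the fly; the full (2*radius+1)^2 slice is never stored.
    Assumption: displacements that leave the image are skipped.
    """
    h, w = len(desc2), len(desc2[0])
    size = 2 * radius + 1
    cost_u = [float("inf")] * size
    cost_v = [float("inf")] * size
    d = desc1[y][x]
    for j, v in enumerate(range(-radius, radius + 1)):
        for i, u in enumerate(range(-radius, radius + 1)):
            xx, yy = x + u, y + v
            if 0 <= xx < w and 0 <= yy < h:
                c = hamming(d, desc2[yy][xx])
                cost_u[i] = min(cost_u[i], c)
                cost_v[j] = min(cost_v[j], c)
    return cost_u, cost_v


# Toy example: 3x3 images with 4-bit descriptors.
desc1 = [[0b0000] * 3 for _ in range(3)]
desc2 = [[0b0000] * 3 for _ in range(3)]
desc1[1][1] = 0b1111
desc2[1][2] = 0b1111  # the same pattern, shifted one pixel to the right

cost_u, cost_v = min_projected_costs(desc1, desc2, x=1, y=1, radius=1)
print(cost_u, cost_v)  # [4, 4, 0] [4, 0, 4]
```

In this toy case the two projected cost vectors each have 3 entries instead of the 9 entries of the full per-pixel cost slice, and their minima sit at u = +1 and v = 0, which is the true displacement.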

Notes

1. Estimated for the cost volume size \(341 \times 145 \times 160 \times 160\) based on numbers in [5] corresponding to \(\frac{1}{3}\) resolution of Sintel images.
2. Since we want to pose matching as a minimization problem.
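
For a sense of scale, the cost volume size quoted in note 1 can be turned into entry counts with a few lines of arithmetic. This is only a rough sketch: the 4-byte (float32) cost and the assumption that a min-projected representation stores one cost vector per flow component for each pixel are illustrative choices, not figures from the paper.

```python
# Dimensions from note 1: image width x height x flow range in u x flow range in v.
W, H = 341, 145
DU, DV = 160, 160
BYTES_PER_COST = 4  # assumption: each cost stored as a 32-bit float

full_entries = W * H * DU * DV            # explicit 4D cost volume
projected_entries = W * H * (DU + DV)     # assumption: two 1D cost vectors per pixel

print(full_entries)                        # 1265792000
print(projected_entries)                   # 15822400
print(full_entries // projected_entries)   # 80
print(full_entries * BYTES_PER_COST / 1e9, "GB for the full volume")
print(projected_entries * BYTES_PER_COST / 1e9, "GB after min-projection")
```

Under these assumptions the full volume has about 1.27 billion entries, while the projected form is smaller by a factor of 80, which is the quadratic-to-linear reduction in the size of the search range.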

References

1. Bailer, C., Taetz, B., Stricker, D.: Flow Fields: dense correspondence fields for highly accurate large displacement optical flow estimation. In: International Conference on Computer Vision (ICCV) (2015)
2. Bengio, Y., Léonard, N., Courville, A.C.: Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR abs/1308.3432 (2013). http://arxiv.org/abs/1308.3432
3. Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33783-3_44
4. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15561-1_56
5. Chen, Q., Koltun, V.: Full Flow: optical flow estimation by global optimization over regular grids. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
6. Chen, Z., Sun, X., Wang, L., Yu, Y., Huang, C.: A deep visual correspondence embedding model for stereo matching costs. In: International Conference on Computer Vision (ICCV) (2015)
7. Courbariaux, M., Bengio, Y.: BinaryNet: training deep neural networks with weights and activations constrained to +1 or \(-1\). CoRR abs/1602.02830 (2016). http://arxiv.org/abs/1602.02830
8. Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient belief propagation for early vision. Int. J. Comput. Vis. 70(1), 41–54 (2006)
9. Gadot, D., Wolf, L.: PatchBatch: a batch augmented loss for optical flow. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
10. Güney, F., Geiger, A.: Deep discrete flow. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10114, pp. 207–224. Springer, Cham (2017). doi:10.1007/978-3-319-54190-7_13
11. Knöbelreiter, P., Reinbacher, C., Shekhovtsov, A., Pock, T.: End-to-end training of hybrid CNN-CRF models for stereo. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017). http://arxiv.org/abs/1611.10229
12. Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)
13. Luo, W., Schwing, A., Urtasun, R.: Efficient deep learning for stereo matching. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
14. Ranftl, R., Bredies, K., Pock, T.: Non-local total generalized variation for optical flow estimation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 439–454. Springer, Cham (2014). doi:10.1007/978-3-319-10590-1_29
15. Ranftl, R., Gehrig, S., Pock, T., Bischof, H.: Pushing the limits of stereo using variational stereo estimation. In: IEEE Intelligent Vehicles Symposium (IV) (2012)
16. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 525–542. Springer, Cham (2016). doi:10.1007/978-3-319-46493-0_32
17. Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: EpicFlow: edge-preserving interpolation of correspondences for optical flow. In: Computer Vision and Pattern Recognition (CVPR) (2015)
18. Shekhovtsov, A., Reinbacher, C., Graber, G., Pock, T.: Solving dense image matching in real-time using discrete-continuous optimization. ArXiv e-prints, January 2016
19. Shekhovtsov, A., Kovtun, I., Hlaváč, V.: Efficient MRF deformation model for non-rigid image matching. CVIU 112, 91–99 (2008)
20. Trzcinski, T., Christoudias, M., Fua, P., Lepetit, V.: Boosting binary keypoint descriptors. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
21. Wainwright, M., Jaakkola, T., Willsky, A.: MAP estimation via agreement on (hyper)trees: message-passing and linear-programming approaches. IT 51(11), 3697–3717 (2005)
22. Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
23. Žbontar, J., LeCun, Y.: Computing the stereo matching cost with a convolutional neural network. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

Acknowledgements

We acknowledge grant support from Toyota Motor Europe HS, the ERC starting grant HOMOVIS No. 640156 and the research initiative Intelligent Vision Austria with funding from the AIT and the Austrian Federal Ministry of Science, Research and Economy HRSM programme (BGBl. II Nr. 292/2012).

Author information

Authors and Affiliations

Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria
Gottfried Munda, Patrick Knöbelreiter & Thomas Pock
Czech Technical University in Prague, Prague 6, Czech Republic
Alexander Shekhovtsov
Center for Vision, Automation and Control, Austrian Institute of Technology, Vienna, Austria
Thomas Pock

Corresponding author

Correspondence to Gottfried Munda.

Editor information

Editors and Affiliations

University of Basel, Basel, Switzerland
Volker Roth
University of Basel, Basel, Switzerland
Thomas Vetter

About this paper

Cite this paper

Munda, G., Shekhovtsov, A., Knöbelreiter, P., Pock, T. (2017). Scalable Full Flow with Learned Binary Descriptors. In: Roth, V., Vetter, T. (eds) Pattern Recognition. GCPR 2017. Lecture Notes in Computer Science, vol 10496. Springer, Cham. https://doi.org/10.1007/978-3-319-66709-6_26

DOI: https://doi.org/10.1007/978-3-319-66709-6_26
Published: 15 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66708-9
Online ISBN: 978-3-319-66709-6
eBook Packages: Computer Science, Computer Science (R0)
