Abstract
We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume with a min-projection, which reduces memory complexity from quadratic to linear, and with binary descriptors for efficient matching. This enables evaluation of the cost on the fly and allows learning and CRF inference to be performed on high-resolution images without ever storing the 4D cost volume. To address the problem of learning binary descriptors, we propose a new hybrid learning scheme. In contrast to current state-of-the-art approaches for learning binary CNNs, we can compute the exact non-zero gradient within our model. We compare several methods for training binary descriptors and show results on publicly available benchmarks.
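The two ingredients named in the abstract, Hamming matching of binary descriptors and a min-projection of the cost, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `hamming` and `min_projected_costs`, the bit-packed `uint8` descriptor layout, and the square search window of a given `radius` are all assumptions made for illustration. The sketch evaluates each displacement on the fly and keeps only the two projected cost arrays (minimum over the vertical and over the horizontal displacement), so the per-pixel storage grows linearly with the number of displacements per axis instead of quadratically.

```python
import numpy as np

# Lookup table: number of set bits in each byte value 0..255.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.int32)


def hamming(a, b):
    """Hamming distance between bit-packed uint8 descriptors (last axis = bytes)."""
    return POPCOUNT[np.bitwise_xor(a, b)].sum(axis=-1)


def min_projected_costs(desc1, desc2, radius):
    """Min-projections of the matching cost (hypothetical helper, illustration only).

    desc1, desc2: (H, W, B) uint8 arrays of bit-packed binary descriptors.
    Returns (cost_u, cost_v), each of shape (H, W, D) with D = 2 * radius + 1:
      cost_u[y, x, iu] = min over v of cost(y, x, u, v), with u = iu - radius
      cost_v[y, x, iv] = min over u of cost(y, x, u, v), with v = iv - radius
    The full (H, W, D, D) cost volume is never stored.
    """
    H, W, B = desc1.shape
    D = 2 * radius + 1
    big = 8 * B + 1  # strictly larger than any possible Hamming distance

    cost_u = np.full((H, W, D), big, dtype=np.int32)
    cost_v = np.full((H, W, D), big, dtype=np.int32)

    # Pad the second descriptor image so every shifted window is a plain slice.
    padded = np.zeros((H + 2 * radius, W + 2 * radius, B), dtype=np.uint8)
    padded[radius:radius + H, radius:radius + W] = desc2
    valid = np.zeros((H + 2 * radius, W + 2 * radius), dtype=bool)
    valid[radius:radius + H, radius:radius + W] = True

    for iv in range(D):          # vertical displacement v = iv - radius
        for iu in range(D):      # horizontal displacement u = iu - radius
            shifted = padded[iv:iv + H, iu:iu + W]
            inside = valid[iv:iv + H, iu:iu + W]
            # Cost of this single displacement, computed on the fly.
            cost = np.where(inside, hamming(desc1, shifted), big)
            cost_u[:, :, iu] = np.minimum(cost_u[:, :, iu], cost)
            cost_v[:, :, iv] = np.minimum(cost_v[:, :, iv], cost)
    return cost_u, cost_v


# Toy usage: the second image is the first one shifted by (u, v) = (2, -1).
rng = np.random.default_rng(0)
desc1 = rng.integers(0, 256, size=(12, 12, 4), dtype=np.uint8)
desc2 = np.roll(desc1, shift=(-1, 2), axis=(0, 1))
cost_u, cost_v = min_projected_costs(desc1, desc2, radius=3)
```

In the toy usage, an interior pixel has zero cost at the true displacement in both projections, which is what a subsequent inference step over the two displacement components would exploit.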
Notes
1. Estimated for the cost volume size \(341{\times }145{\times }160{\times }160\) based on numbers in [5] corresponding to \(\frac{1}{3}\) resolution of Sintel images.
2. Since we want to pose matching as a minimization problem.
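To give a sense of scale for the cost volume size quoted in note 1, the following short sketch counts its entries and compares them with the storage of two min-projected arrays. The 4-byte (float32) entry size is an assumption made only for this illustration; the note itself does not specify a data type, and the names used here are hypothetical.

```python
# Dimensions from note 1: 341 x 145 pixels, 160 x 160 displacements per pixel.
width, height, disp_u, disp_v = 341, 145, 160, 160

full_entries = width * height * disp_u * disp_v          # full 4D cost volume
projected_entries = width * height * (disp_u + disp_v)   # two min-projections

BYTES_PER_ENTRY = 4  # assumption: float32 costs
full_gib = full_entries * BYTES_PER_ENTRY / 2**30
projected_mib = projected_entries * BYTES_PER_ENTRY / 2**20

print(f"full volume:     {full_entries:,} entries, {full_gib:.2f} GiB")
print(f"min-projections: {projected_entries:,} entries, {projected_mib:.2f} MiB")
print(f"reduction factor: {full_entries // projected_entries}x")
```

Under these assumptions the full volume has about 1.27 billion entries, while the two projections together hold 80 times fewer.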
References
Bailer, C., Taetz, B., Stricker, D.: Flow Fields: dense correspondence fields for highly accurate large displacement optical flow estimation. In: International Conference on Computer Vision (ICCV) (2015)
Bengio, Y., Léonard, N., Courville, A.C.: Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR abs/1308.3432 (2013). http://arxiv.org/abs/1308.3432
Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33783-3_44
Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15561-1_56
Chen, Q., Koltun, V.: Full Flow: optical flow estimation by global optimization over regular grids. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Chen, Z., Sun, X., Wang, L., Yu, Y., Huang, C.: A deep visual correspondence embedding model for stereo matching costs. In: International Conference on Computer Vision (ICCV) (2015)
Courbariaux, M., Bengio, Y.: BinaryNet: training deep neural networks with weights and activations constrained to +1 or \(-1\). CoRR abs/1602.02830 (2016). http://arxiv.org/abs/1602.02830
Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient belief propagation for early vision. Int. J. Comput. Vis. 70(1), 41–54 (2006)
Gadot, D., Wolf, L.: PatchBatch: a batch augmented loss for optical flow. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Güney, F., Geiger, A.: Deep discrete flow. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10114, pp. 207–224. Springer, Cham (2017). doi:10.1007/978-3-319-54190-7_13
Knöbelreiter, P., Reinbacher, C., Shekhovtsov, A., Pock, T.: End-to-end training of hybrid CNN-CRF models for stereo. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017). http://arxiv.org/abs/1611.10229
Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)
Luo, W., Schwing, A., Urtasun, R.: Efficient deep learning for stereo matching. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Ranftl, R., Bredies, K., Pock, T.: Non-local total generalized variation for optical flow estimation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 439–454. Springer, Cham (2014). doi:10.1007/978-3-319-10590-1_29
Ranftl, R., Gehrig, S., Pock, T., Bischof, H.: Pushing the limits of stereo using variational stereo estimation. In: IEEE Intelligent Vehicles Symposium (IV) (2012)
Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 525–542. Springer, Cham (2016). doi:10.1007/978-3-319-46493-0_32
Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: EpicFlow: edge-preserving interpolation of correspondences for optical flow. In: Computer Vision and Pattern Recognition (CVPR) (2015)
Shekhovtsov, A., Reinbacher, C., Graber, G., Pock, T.: Solving dense image matching in real-time using discrete-continuous optimization. ArXiv e-prints, January 2016
Shekhovtsov, A., Kovtun, I., Hlaváč, V.: Efficient MRF deformation model for non-rigid image matching. CVIU 112, 91–99 (2008)
Trzcinski, T., Christoudias, M., Fua, P., Lepetit, V.: Boosting binary keypoint descriptors. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
Wainwright, M., Jaakkola, T., Willsky, A.: MAP estimation via agreement on (hyper)trees: message-passing and linear-programming approaches. IT 51(11), 3697–3717 (2005)
Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Žbontar, J., LeCun, Y.: Computing the stereo matching cost with a convolutional neural network. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Acknowledgements
We acknowledge grant support from Toyota Motor Europe HS, the ERC starting grant HOMOVIS No. 640156 and the research initiative Intelligent Vision Austria with funding from the AIT and the Austrian Federal Ministry of Science, Research and Economy HRSM programme (BGBl. II Nr. 292/2012).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Munda, G., Shekhovtsov, A., Knöbelreiter, P., Pock, T. (2017). Scalable Full Flow with Learned Binary Descriptors. In: Roth, V., Vetter, T. (eds) Pattern Recognition. GCPR 2017. Lecture Notes in Computer Science(), vol 10496. Springer, Cham. https://doi.org/10.1007/978-3-319-66709-6_26
DOI: https://doi.org/10.1007/978-3-319-66709-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66708-9
Online ISBN: 978-3-319-66709-6
eBook Packages: Computer Science, Computer Science (R0)