Exploiting Semantic Information and Deep Matching for Optical Flow

  • Min Bai
  • Wenjie Luo
  • Kaustav Kundu
  • Raquel Urtasun
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9910)


We tackle the problem of estimating optical flow from a monocular camera in the context of autonomous driving. We build on the observation that the scene is typically composed of a static background and a relatively small number of traffic participants that move rigidly in 3D. We propose to segment the traffic participants using instance-level segmentation. For each traffic participant, we exploit the epipolar constraints that govern its independent rigid motion, yielding faster and more accurate estimation. Our second contribution is a new convolutional net that learns to perform flow matching and can estimate the uncertainty of its matches; it is a core element of our flow estimation pipeline. We demonstrate the effectiveness of our approach on the challenging KITTI 2015 flow benchmark and show that it outperforms published approaches by a large margin.
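The epipolar constraint mentioned above is what lets each rigidly moving object's flow be searched along a line instead of over the full 2D image. A minimal sketch of that check, with a hypothetical fundamental matrix `F` (in the paper it would be estimated per segmented instance, e.g. via the eight-point algorithm), not the authors' actual implementation:

```python
# Hedged sketch of the epipolar constraint that restricts per-object flow.
# F below is a HYPOTHETICAL fundamental matrix for one rigidly moving
# traffic participant: pure horizontal translation, so F = [t]_x with
# t = (1, 0, 0), which makes all epipolar lines horizontal.

def mat_vec(F, p):
    """Multiply a 3x3 matrix by a homogeneous point [x, y, 1]."""
    return [sum(F[i][j] * p[j] for j in range(3)) for i in range(3)]

def epipolar_residual(F, p, q):
    """q^T F p -- zero exactly when the match p -> q obeys the epipolar
    geometry, i.e. q lies on the epipolar line l' = F p."""
    l = mat_vec(F, p)  # epipolar line in the second image
    return sum(q[i] * l[i] for i in range(3))

F = [[0.0, 0.0,  0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0,  0.0]]

p = [10.0, 5.0, 1.0]       # pixel in frame t (homogeneous coordinates)
q_good = [14.0, 5.0, 1.0]  # flow along the epipolar line (same row)
q_bad = [14.0, 7.0, 1.0]   # flow violating the rigid-motion constraint

print(epipolar_residual(F, p, q_good))  # 0.0: consistent candidate match
print(epipolar_residual(F, p, q_bad))   # -2.0: can be pruned from search
```

Restricting candidate matches to the epipolar line of the object's own motion is what reduces the correspondence search from 2D to 1D per traffic participant.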


Keywords: Optical flow · Low-level vision · Deep learning · Autonomous driving



This work was partially supported by ONR-N00014-14-1-0232, Samsung, and NSERC.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Computer Science, University of Toronto, Toronto, Canada
