View-Consistent 3D Scene Flow Estimation over Multiple Frames

  • Christoph Vogel
  • Stefan Roth
  • Konrad Schindler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8692)


We propose a method to recover dense 3D scene flow from stereo video. The method estimates the depth and 3D motion field of a dynamic scene from multiple consecutive frames in a sliding temporal window, such that the estimate is consistent across both viewpoints of all frames within the window. The observed scene is modeled as a collection of planar patches that are consistent across views, each undergoing a rigid motion that is approximately constant over time. Finding the patches and their motions is cast as minimization of an energy function over the continuous plane and motion parameters and the discrete pixel-to-plane assignment. We show that such a view-consistent multi-frame scheme greatly improves scene flow computation in the presence of occlusions, and increases its robustness against adverse imaging conditions, such as specularities. Our method currently achieves leading performance on the KITTI benchmark, for both flow and stereo.
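To make the patch representation concrete, the following is a minimal sketch of how a single planar patch with a rigid motion determines both depth and 3D motion for every pixel assigned to it. It is not the authors' implementation. The scaled-normal plane parameterization (points X on the plane satisfy normal · X = 1), the pinhole camera with focal length `f` and principal point `(cx, cy)`, and all function names are assumptions made for illustration.

```python
# Illustrative sketch only: the parameterization and all names are assumptions,
# not taken from the paper.

def backproject(pixel, normal, f, cx, cy):
    """Return the 3D point where the viewing ray of `pixel` meets the plane normal . X = 1."""
    u, v = pixel
    ray = ((u - cx) / f, (v - cy) / f, 1.0)          # viewing ray scaled so that Z = 1
    denom = sum(n * r for n, r in zip(normal, ray))  # normal . ray, equal to 1 / depth
    if denom <= 0.0:
        raise ValueError("ray does not meet the plane in front of the camera")
    return tuple(r / denom for r in ray)


def apply_rigid(R, t, X):
    """Apply the rigid motion X -> R X + t (R is a 3x3 rotation given as rows)."""
    return tuple(sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3))


def scene_flow(pixel, normal, R, t, f, cx, cy):
    """Return the 3D point seen at `pixel` and its 3D displacement under (R, t)."""
    X0 = backproject(pixel, normal, f, cx, cy)
    X1 = apply_rigid(R, t, X0)
    return X0, tuple(b - a for a, b in zip(X0, X1))


def project(X, f, cx, cy):
    """Pinhole projection of a 3D point to pixel coordinates."""
    return (f * X[0] / X[2] + cx, f * X[1] / X[2] + cy)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    plane = (0.0, 0.0, 0.1)       # fronto-parallel patch at depth 10
    motion_t = (0.0, 0.0, -1.0)   # patch moves one unit towards the camera
    X0, flow = scene_flow((420.0, 240.0), plane, identity, motion_t, 1000.0, 320.0, 240.0)
    X1 = tuple(a + b for a, b in zip(X0, flow))
    print("3D point:", X0)
    print("3D flow:", flow)
    print("pixel in next frame:", project(X1, 1000.0, 320.0, 240.0))
```

Under this parameterization a patch is described by nine continuous numbers (three for the plane, six for the rigid motion), and the only per-pixel unknown is the discrete choice of patch. With a pure translation, every pixel on the patch receives the same 3D flow, equal to the translation itself.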


Keywords (added by machine, not by the authors): Rigid Motion, Data Term, Multiple Frame, Reference View, Canonical View



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Christoph Vogel (1)
  • Stefan Roth (2)
  • Konrad Schindler (1)
  1. Photogrammetry and Remote Sensing, ETH Zurich, Switzerland
  2. Department of Computer Science, TU Darmstadt, Germany
