International Journal of Computer Vision, Volume 71, Issue 1, pp 89–110

Efficient Dense Stereo with Occlusions for New View-Synthesis by Four-State Dynamic Programming

  • A. Criminisi
  • A. Blake
  • C. Rother
  • J. Shotton
  • P. H. S. Torr


A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera placed at an arbitrary position near the physical cameras. The technique is based on an improved dynamic-programming stereo algorithm for efficient novel view generation. The two main contributions of this paper are: (i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; and (ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum-cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model, which enhances the rendering of occlusions and reduces temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space.
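To make the idea of scanline dynamic programming with occlusion labelling concrete, the following is a minimal sketch of the classic three-move formulation (match, skip a left pixel, skip a right pixel). It is not the four-state matching graph proposed in this paper, whose details are not given in the abstract. The function name `scanline_dp`, the squared-difference matching cost and the default `occlusion_cost` are illustrative assumptions.

```python
def scanline_dp(left, right, occlusion_cost=20.0):
    """Align two rectified scanlines with a three-move dynamic programme.

    Hypothetical illustration only: a simpler model than the paper's
    four-state graph. Returns (total_cost, labels), where labels[i] is the
    disparity (left index minus right index) of left pixel i, or None if
    that pixel was labelled as occluded.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of left[:i] with right[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # Match left pixel i-1 with right pixel j-1.
                c = cost[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "match"
            if i > 0:
                # Left pixel i-1 has no partner (occluded in the right view).
                c = cost[i - 1][j] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_left"
            if j > 0:
                # Right pixel j-1 has no partner (occluded in the left view).
                c = cost[i][j - 1] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "skip_right"

    # Backtrack from the end of both scanlines to recover the labelling.
    labels = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            labels[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif step == "skip_left":
            i -= 1
        else:
            j -= 1
    return cost[n][m], labels


# Toy scanlines: the right line is the left line shifted by one pixel.
left_line = [0, 0, 100, 200, 0, 0]
right_line = [0, 100, 200, 0, 0, 0]
total, labels = scanline_dp(left_line, right_line)
```

In this toy case the two bright pixels should receive a disparity of one, and exactly one left pixel should be labelled as occluded. Because the moves only ever advance along both scanlines, the sketch implicitly enforces the ordering constraint that scanline dynamic programming methods commonly assume.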

The proposed algorithm has been designed to work with input images with a large disparity range, a common practical situation. The enhanced occlusion-handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. These include demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.
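As a rough illustration of a disparity-based error metric of the Middlebury kind, the sketch below computes the fraction of pixels whose disparity differs from ground truth by more than a threshold. The abstract does not say which thresholds or image regions were used in the evaluation, so the function name `bad_pixel_rate`, the default `threshold` and the optional `mask` argument are assumptions made for this example.

```python
def bad_pixel_rate(computed, truth, threshold=1.0, mask=None):
    """Fraction of evaluated pixels with |computed - truth| > threshold.

    computed, truth: disparity maps as equally sized lists of rows.
    mask: optional map of booleans selecting which pixels to evaluate
          (hypothetical parameter, e.g. to restrict to non-occluded pixels).
    """
    bad = 0
    counted = 0
    for y, (row_c, row_t) in enumerate(zip(computed, truth)):
        for x, (d_c, d_t) in enumerate(zip(row_c, row_t)):
            if mask is not None and not mask[y][x]:
                continue  # pixel excluded from the evaluation
            counted += 1
            if abs(d_c - d_t) > threshold:
                bad += 1
    if counted == 0:
        raise ValueError("no pixels selected for evaluation")
    return bad / counted


# Toy 2x2 disparity maps: only the pixel with error 2.0 exceeds the threshold.
estimate = [[1.0, 2.0], [3.0, 9.0]]
ground_truth = [[1.0, 2.0], [5.0, 9.5]]
rate = bad_pixel_rate(estimate, ground_truth)  # 1 bad pixel out of 4
```

A lower rate indicates a disparity map closer to ground truth. Passing a mask allows the same function to report the error over a chosen subset of pixels.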


Keywords: dense stereo, image-based rendering, video-conferencing, gaze correction





Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  • A. Criminisi (1)
  • A. Blake (1)
  • C. Rother (1)
  • J. Shotton (2)
  • P. H. S. Torr (3)

  1. Microsoft Research Ltd, Cambridge, UK
  2. University of Cambridge, Cambridge, UK
  3. Oxford Brookes University, Wheatley, Oxford, UK
