Efficient Dense Stereo with Occlusions for New View-Synthesis by Four-State Dynamic Programming
A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera at an arbitrary position near the physical cameras. The new technique is based on an improved dynamic-programming stereo algorithm for efficient novel view generation. The two main contributions of this paper are: (i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; and (ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum-cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model, which enhances the rendering of occlusions and reduces temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space.
The proposed algorithm has been designed to work with input images with a large disparity range, a common practical situation. The enhanced occlusion handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. These include demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.
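For readers unfamiliar with scanline dynamic programming for stereo, the general idea can be illustrated with a minimal sketch. This is not the four-state matching graph of the paper, whose details the abstract does not give. It is a simplified classical formulation, assuming rectified scanlines, an absolute-difference matching cost and a single fixed occlusion penalty. The function name `scanline_dp` and the parameter `occlusion_cost` are hypothetical.

```python
import numpy as np

def scanline_dp(left, right, occlusion_cost=10.0):
    """Align two rectified scanlines by dynamic programming.

    Returns one entry per left pixel: its disparity (x_left - x_right)
    if matched, or None if the pixel is labelled occluded.
    Simplified illustration only, not the paper's four-state graph.
    """
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    # Borders: the only way to reach them is by skipping pixels.
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    # Back-pointers: 0 = match, 1 = skip a left pixel, 2 = skip a right pixel.
    move = np.zeros((n + 1, m + 1), dtype=np.int8)
    move[1:, 0] = 1
    move[0, 1:] = 2

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = (
                cost[i - 1, j - 1] + abs(float(left[i - 1]) - float(right[j - 1])),
                cost[i - 1, j] + occlusion_cost,
                cost[i, j - 1] + occlusion_cost,
            )
            best = int(np.argmin(candidates))
            move[i, j] = best
            cost[i, j] = candidates[best]

    # Backtrack along the minimum-cost path to label each left pixel.
    disparity = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return disparity

# The right scanline is the left one shifted by one pixel.
print(scanline_dp([0, 20, 40, 60, 80], [20, 40, 60, 80, 100]))
# -> [None, 1, 1, 1, 1]
```

In this toy input the first left pixel has no counterpart in the right scanline, so the minimum-cost path skips it and labels it occluded, while the remaining pixels receive a disparity of one.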
Keywords: dense stereo, image-based rendering, video-conferencing, gaze correction