A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms

  • Daniel Scharstein
  • Richard Szeliski

Abstract

Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today's best-performing stereo algorithms.

Keywords: stereo matching survey, stereo correspondence software, evaluation of performance

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Daniel Scharstein (1)
  • Richard Szeliski (2)
  1. Department of Mathematics and Computer Science, Middlebury College, Middlebury, USA
  2. Microsoft Research, Microsoft Corporation, Redmond, USA
