International Journal of Computer Vision, Volume 5, Issue 3, pp 271–301

Bayesian modeling of uncertainty in low-level vision

  • Richard Szeliski
Article

Abstract

The need for error modeling, multisensor fusion, and robust algorithms is becoming increasingly recognized in computer vision. Bayesian modeling is a powerful, practical, and general framework for meeting these requirements. This article develops a Bayesian model for describing and manipulating the dense fields, such as depth maps, associated with low-level computer vision. Our model consists of three components: a prior model, a sensor model, and a posterior model. The prior model captures a priori information about the structure of the field. We construct this model using the smoothness constraints from regularization to define a Markov Random Field. The sensor model describes the behavior and noise characteristics of our measurement system. We develop a number of sensor models for both sparse and dense measurements. The posterior model combines the information from the prior and sensor models using Bayes' rule. We show how to compute optimal estimates from the posterior model and also how to compute the uncertainty (variance) in these estimates. To demonstrate the utility of our Bayesian framework, we present three examples of its application to real vision problems. The first application is the on-line extraction of depth from motion. Using a two-dimensional generalization of the Kalman filter, we develop an incremental algorithm that provides a dense on-line estimate of depth whose accuracy improves over time. In the second application, we use a Bayesian model to determine observer motion from sparse depth (range) measurements. In the third application, we use the Bayesian interpretation of regularization to choose the optimal smoothing parameter for interpolation. The uncertainty modeling techniques that we develop, and the utility of these techniques in various applications, support our claim that Bayesian modeling is a powerful and practical framework for low-level vision.
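As a rough illustration of how the three components described above fit together, the sketch below builds a prior from a first-order smoothness constraint, adds a Gaussian sensor model for sparse measurements, and combines them with Bayes' rule to obtain both an estimate and its variance. It is not the article's algorithm: the 1-D field, the first-difference prior, the noise model, and the names `posterior_estimate`, `lam`, and `sigma` are all assumptions made for this example.

```python
import numpy as np


def posterior_estimate(n, obs_idx, obs_val, sigma, lam):
    """MAP estimate and per-sample variance of a 1-D field u[0..n-1].

    Assumed prior:  p(u) ~ exp(-(lam / 2) * sum_i (u[i+1] - u[i])**2)
    Assumed sensor: d_k = u[obs_idx[k]] + noise, noise ~ N(0, sigma**2)
    Requires at least one observation so the posterior is proper.
    """
    # Prior information matrix: lam * D^T D, with D the first-difference operator.
    D = np.diff(np.eye(n), axis=0)          # shape (n - 1, n)
    A_prior = lam * D.T @ D

    # Sensor information matrix and weighted data vector (sparse measurements).
    A_data = np.zeros((n, n))
    b = np.zeros(n)
    w = 1.0 / sigma**2
    for i, d in zip(obs_idx, obs_val):
        A_data[i, i] += w
        b[i] += w * d

    # Bayes' rule for Gaussians: information matrices add.
    A_post = A_prior + A_data
    cov = np.linalg.inv(A_post)             # posterior covariance
    u_map = cov @ b                         # posterior mean (also the MAP estimate)
    return u_map, np.diag(cov)


# Hypothetical usage: five samples, measured only at the two ends.
u_map, var = posterior_estimate(n=5, obs_idx=[0, 4], obs_val=[0.0, 1.0],
                                sigma=0.1, lam=1.0)
print("estimate:", u_map)
print("variance:", var)
```

In this toy setting the variance is smallest at the measured samples and grows away from them, which is the kind of uncertainty information the posterior model is meant to expose. A dense matrix inverse is used only for clarity; for image-sized fields it would be impractical.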

Copyright information

© Kluwer Academic Publishers 1990

Authors and Affiliations

  • Richard Szeliski, Cambridge Research Lab, Digital Equipment Corporation, Cambridge
