Bayesian modeling of uncertainty in low-level vision

Szeliski, Richard

doi:10.1007/BF00126502

Published: December 1990

Volume 5, pages 271–301, (1990)

International Journal of Computer Vision

Richard Szeliski¹

Abstract

The need for error modeling, multisensor fusion, and robust algorithms is becoming increasingly recognized in computer vision. Bayesian modeling is a powerful, practical, and general framework for meeting these requirements. This article develops a Bayesian model for describing and manipulating the dense fields, such as depth maps, associated with low-level computer vision. Our model consists of three components: a prior model, a sensor model, and a posterior model. The prior model captures a priori information about the structure of the field. We construct this model using the smoothness constraints from regularization to define a Markov Random Field. The sensor model describes the behavior and noise characteristics of our measurement system. We develop a number of sensor models for both sparse and dense measurements. The posterior model combines the information from the prior and sensor models using Bayes' rule. We show how to compute optimal estimates from the posterior model and also how to compute the uncertainty (variance) in these estimates. To demonstrate the utility of our Bayesian framework, we present three examples of its application to real vision problems. The first application is the on-line extraction of depth from motion. Using a two-dimensional generalization of the Kalman filter, we develop an incremental algorithm that provides a dense on-line estimate of depth whose accuracy improves over time. In the second application, we use a Bayesian model to determine observer motion from sparse depth (range) measurements. In the third application, we use the Bayesian interpretation of regularization to choose the optimal smoothing parameter for interpolation. The uncertainty modeling techniques that we develop, and the utility of these techniques in various applications, support our claim that Bayesian modeling is a powerful and practical framework for low-level vision.
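The three-component structure described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: it assumes a one-dimensional field, a first-order (membrane) smoothness prior, and sparse measurements with independent Gaussian noise. The function name `posterior_estimate` and the parameters `lam` (prior strength) and `sigma` (sensor noise standard deviation) are hypothetical names chosen for this example. Under these assumptions the prior and the sensor model are both quadratic energies, so Bayes' rule gives a Gaussian posterior whose mean is the optimal estimate and whose covariance diagonal is the per-sample uncertainty.

```python
import numpy as np

def posterior_estimate(n, idx, values, sigma, lam):
    """Posterior mean and variance of a 1-D field of n samples.

    Assumed prior: Gaussian MRF with energy 0.5 * lam * sum (u[i+1] - u[i])**2.
    Assumed sensor model: values[k] = u[idx[k]] + Gaussian noise of std sigma.
    """
    # Prior model: first-difference operator D, shape (n - 1, n).
    D = np.diff(np.eye(n), axis=0)
    A_prior = lam * D.T @ D            # prior precision (singular on its own)

    # Sensor model: diagonal precision, nonzero only where data exists.
    w = np.zeros(n)
    w[idx] = 1.0 / sigma**2
    A_data = np.diag(w)
    b = np.zeros(n)
    b[idx] = np.asarray(values, dtype=float) / sigma**2

    # Posterior model: precisions add; the mean solves A u = b.
    A = A_prior + A_data
    cov = np.linalg.inv(A)             # dense inverse, fine for small n only
    mean = cov @ b
    return mean, np.diag(cov)

# Two measurements at the ends of a 9-sample field.
mean, var = posterior_estimate(n=9, idx=[0, 8], values=[1.0, 3.0],
                               sigma=0.1, lam=10.0)
```

In this toy setting the estimate interpolates smoothly between the two measurements, and the variance grows with distance from the data, which is the kind of uncertainty information the abstract refers to. A dense inverse is used only for clarity; it would not scale to image-sized fields.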

Author information

Authors and Affiliations

Cambridge Research Lab, Digital Equipment Corporation, One Kendall Square, Bldg. 700, Cambridge, MA 02139
Richard Szeliski

About this article

Cite this article

Szeliski, R. Bayesian modeling of uncertainty in low-level vision. Int J Comput Vision 5, 271–301 (1990). https://doi.org/10.1007/BF00126502

Issue Date: December 1990
DOI: https://doi.org/10.1007/BF00126502
