International Journal of Computer Vision, Volume 37, Issue 3, pp 231–258

Structure from Motion: Beyond the Epipolar Constraint

  • Tomáš Brodský
  • Cornelia Fermüller
  • Yiannis Aloimonos

Abstract

The classic approach to structure from motion entails a clear separation between motion estimation and structure estimation, and between two-dimensional (2D) and three-dimensional (3D) information. For the recovery of the rigid transformation between different views, only 2D image measurements are used. To have enough information available, most existing techniques rely on the intermediate computation of optical flow, which, however, poses a problem at the locations of depth discontinuities. If we knew where depth discontinuities were, we could (using a multitude of approaches based on smoothness constraints) accurately estimate flow values for image patches corresponding to smooth scene patches; but knowing the discontinuities requires solving the structure from motion problem first. This paper introduces a novel approach to structure from motion which addresses the processes of smoothing, 3D motion and structure estimation in a synergistic manner. It provides an algorithm for estimating the transformation between two views obtained by either a calibrated or an uncalibrated camera. The results of the estimation are then utilized to reconstruct the scene from a short sequence of images.

The technique is based on constraints on image derivatives which involve the 3D motion and shape of the scene, leading to a geometric and statistical estimation problem. The interaction between 3D motion and shape allows us to estimate the 3D motion while at the same time segmenting the scene. If we use a wrong 3D motion estimate to compute depth, we obtain a distorted version of the depth function. The distortion, however, is such that the worse the motion estimate, the more likely we are to obtain depth estimates that vary locally more than the correct ones. Since local variability of depth is due either to the existence of a discontinuity or to a wrong 3D motion estimate, being able to differentiate between these two cases provides the correct motion, which yields the “least varying” estimated depth as well as the image locations of scene discontinuities. We analyze the new constraints, show their relationship to the minimization of the epipolar constraint, and present experimental results using real image sequences that indicate the robustness of the method.
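
The idea in the last paragraph — that a wrong 3D motion estimate produces depth estimates with larger local variation than the correct one — can be sketched in code. The following is a minimal illustration, not the authors' algorithm: it assumes a calibrated camera, the instantaneous (differential) rigid-motion model for the image motion field, and inverse depth recovered pointwise from the brightness-change constraint; the function names, the patch-variance score, and the exhaustive candidate search are simplifying assumptions made here for clarity.

```python
import numpy as np

def inverse_depth_from_derivatives(Ix, Iy, It, t, omega, f=1.0):
    """Pointwise inverse depth implied by a candidate rigid motion (t, omega),
    using the instantaneous motion-field model and the brightness-change
    constraint Ix*u + Iy*v + It = 0.  Illustrative sketch only."""
    h, w = Ix.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    xx -= w / 2.0                      # image coordinates with the principal
    yy -= h / 2.0                      # point assumed at the image center

    # Translational part of the motion field, to be scaled by 1/Z.
    at_u = -f * t[0] + xx * t[2]
    at_v = -f * t[1] + yy * t[2]

    # Rotational part of the motion field (independent of depth).
    bw_u = (xx * yy / f) * omega[0] - (f + xx**2 / f) * omega[1] + yy * omega[2]
    bw_v = (f + yy**2 / f) * omega[0] - (xx * yy / f) * omega[1] - xx * omega[2]

    # Solve Ix*((1/Z)*at_u + bw_u) + Iy*((1/Z)*at_v + bw_v) + It = 0 for 1/Z.
    numer = -(It + Ix * bw_u + Iy * bw_v)
    denom = Ix * at_u + Iy * at_v
    valid = np.abs(denom) > 1e-6       # discard numerically unstable pixels
    inv_z = np.zeros_like(denom, dtype=float)
    inv_z[valid] = numer[valid] / denom[valid]
    return inv_z, valid

def depth_variability(inv_z, valid, patch=8):
    """Sum of local variances of the estimated inverse depth over small
    patches; a wrong motion estimate tends to make this score larger."""
    h, w = inv_z.shape
    score = 0.0
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            m = valid[r:r + patch, c:c + patch]
            if m.sum() > patch:        # need enough valid pixels in the patch
                score += np.var(inv_z[r:r + patch, c:c + patch][m])
    return score

def select_motion(Ix, Iy, It, candidates, f=1.0):
    """Return the candidate (t, omega) whose implied depth varies least."""
    scores = []
    for t, omega in candidates:
        inv_z, valid = inverse_depth_from_derivatives(Ix, Iy, It, t, omega, f)
        scores.append(depth_variability(inv_z, valid))
    return candidates[int(np.argmin(scores))]
```

Under these assumptions, the candidate translation and rotation minimizing the summed local variance of the implied inverse depth would be selected as the motion estimate, and the patches that remain highly variable would be flagged as likely depth discontinuities; the paper's actual constraints are formulated directly on image derivatives and handle the distinction between discontinuities and motion errors more carefully than this simple variance score.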

Keywords: 3D motion estimation · scene reconstruction · smoothing and discontinuity detection · depth variability constraint

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Tomáš Brodský (1)
  • Cornelia Fermüller (1)
  • Yiannis Aloimonos (1)

  1. Computer Vision Laboratory, Center for Automation Research, University of Maryland, USA
