International Journal of Computer Vision, Volume 9, Issue 2, pp 137–154

Shape and motion from image streams under orthography: a factorization method

  • Carlo Tomasi
  • Takeo Kanade


Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step.

An image stream can be represented by the 2F×P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3.
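The rank claim can be illustrated with a short derivation. The notation is an assumption borrowed from the usual statement of the orthographic model, not from this abstract: $\mathbf{i}_f, \mathbf{j}_f$ are the orthonormal image-axis vectors of the camera in frame $f$, $\mathbf{s}_p$ is the 3D position of point $p$, and $\mathbf{t}_f$ is the camera position. The world origin is assumed to sit at the centroid of the $P$ points, and the rank bound is read as applying after each row's mean has been subtracted.

```latex
% Orthographic projection of point p in frame f (assumed notation).
\[
u_{fp} = \mathbf{i}_f^{\top}(\mathbf{s}_p - \mathbf{t}_f), \qquad
v_{fp} = \mathbf{j}_f^{\top}(\mathbf{s}_p - \mathbf{t}_f)
\]

% With \sum_p \mathbf{s}_p = 0, subtracting each row's mean removes translation.
\[
\tilde{u}_{fp} = u_{fp} - \frac{1}{P}\sum_{q=1}^{P} u_{fq} = \mathbf{i}_f^{\top}\mathbf{s}_p, \qquad
\tilde{v}_{fp} = v_{fp} - \frac{1}{P}\sum_{q=1}^{P} v_{fq} = \mathbf{j}_f^{\top}\mathbf{s}_p
\]

% Stacking all frames and points gives a product of a 2F x 3 and a 3 x P matrix.
\[
\tilde{W} =
\begin{bmatrix}
\mathbf{i}_1^{\top} \\ \vdots \\ \mathbf{i}_F^{\top} \\
\mathbf{j}_1^{\top} \\ \vdots \\ \mathbf{j}_F^{\top}
\end{bmatrix}
\begin{bmatrix}
\mathbf{s}_1 & \cdots & \mathbf{s}_P
\end{bmatrix}
= R\,S,
\qquad
\operatorname{rank}(\tilde{W}) \le \min\bigl(\operatorname{rank} R,\ \operatorname{rank} S\bigr) \le 3
\]
```

With noise-free measurements the bound is met with equality whenever the points are not coplanar and the camera axes span three dimensions over the sequence.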

Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures.
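A minimal sketch of the factorization step, assuming NumPy and a fully observed, noise-free measurement matrix. The function names `factor_rank3` and `synthetic_stream` are hypothetical, and the synthetic data generator is only there to exercise the code. The sketch stops at the rank-3 factors: these are determined only up to an invertible 3×3 matrix, and it does not show how that ambiguity is resolved or how missing entries are handled.

```python
import numpy as np


def factor_rank3(W):
    """Split a 2F x P measurement matrix into rank-3 motion and shape factors."""
    # Preprocessing: the mean of each row captures the in-plane translation.
    t = W.mean(axis=1, keepdims=True)
    W_reg = W - t
    # Thin SVD; singular values come back sorted in descending order.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # Keep the three dominant components and split them symmetrically.
    root = np.sqrt(s[:3])
    M = U[:, :3] * root           # 2F x 3, motion factor (up to a 3x3 ambiguity)
    S = root[:, None] * Vt[:3]    # 3 x P, shape factor (up to the inverse ambiguity)
    return M, S, t, s


def synthetic_stream(F=6, P=20, seed=0):
    """Hypothetical test data: P random 3D points seen orthographically in F frames."""
    rng = np.random.default_rng(seed)
    pts = rng.normal(size=(3, P))
    rows_u, rows_v = [], []
    for _ in range(F):
        # Rows of a random orthogonal matrix give orthonormal image axes.
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        shift = rng.normal(size=2)
        rows_u.append(Q[0] @ pts + shift[0])
        rows_v.append(Q[1] @ pts + shift[1])
    # Horizontal coordinates for all frames first, then vertical coordinates.
    return np.vstack(rows_u + rows_v)


W = synthetic_stream()
M, S, t, sing = factor_rank3(W)
residual = np.linalg.norm(W - (M @ S + t))
```

On this synthetic input the fourth singular value is at round-off level and `M @ S + t` reproduces `W`. With real tracked features the trailing singular values would instead reflect measurement noise.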

The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.


Computer Vision; Intermediate Step; Factorization Method; Camera Motion; Measurement Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Kluwer Academic Publishers 1992

Authors and Affiliations

  • Carlo Tomasi (1)
  • Takeo Kanade (2)

  1. Department of Computer Science, Cornell University, Ithaca
  2. School of Computer Science, Carnegie Mellon University, Pittsburgh
