International Journal of Computer Vision, Volume 29, Issue 3, pp 159–179

A Multibody Factorization Method for Independently Moving Objects

  • João Paulo Costeira
  • Takeo Kanade


The structure-from-motion problem has been extensively studied in the field of computer vision. Yet the bulk of the existing work assumes that the scene contains only a single moving object. The more realistic case, where an unknown number of objects move in the scene, has received little attention, especially in terms of theoretical treatment. In this paper we present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor does it depend on any grouping of features into objects at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features, without grouping them into individual objects. Once computed, the matrix allows features to be segmented into objects by transforming it into a canonical form, and the shape and motion of each object can then be recovered. The theory holds under a broad set of projection models (scaled orthography, paraperspective, and affine), but the model must be linear, which excludes projective “cameras”.
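The invariance described above can be illustrated with a small numerical sketch. It assumes noise-free synthetic data, two objects under random affine cameras, and takes the shape interaction matrix to be Q = V Vᵀ, where V holds the first r right singular vectors of the trajectory matrix W. The function names, the rank tolerance, and the synthetic data generator are all hypothetical choices made for illustration; the sketch only checks that entries of Q linking features on different objects vanish, and does not implement the canonical-form segmentation itself.

```python
import numpy as np

def shape_interaction_matrix(W, tol=1e-8):
    """Return (Q, r) for a 2F x P trajectory matrix W.

    Assumption: the rank r is estimated by thresholding singular values,
    which is only adequate for noise-free data.
    """
    _, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))
    V = Vt[:r].T                      # P x r basis of the row space of W
    return V @ V.T, r

def synthetic_trajectories(n_frames, n_points, rng):
    """Hypothetical generator: one rigid object seen by random affine cameras."""
    # Homogeneous 3D shape (4 x P) and stacked affine motion (2F x 4).
    S = np.vstack([rng.standard_normal((3, n_points)), np.ones((1, n_points))])
    M = rng.standard_normal((2 * n_frames, 4))
    return M @ S

rng = np.random.default_rng(0)
W = np.hstack([synthetic_trajectories(10, 6, rng),
               synthetic_trajectories(10, 7, rng)])
labels = np.array([0] * 6 + [1] * 7)

# Shuffle the columns so the features are not grouped by object.
perm = rng.permutation(W.shape[1])
W, labels = W[:, perm], labels[perm]

Q, r = shape_interaction_matrix(W)

# Entries of Q that pair features from different objects should be ~0.
different = labels[:, None] != labels[None, :]
print("estimated rank:", r)
print("largest cross-object entry:", np.abs(Q[different]).max())
print("largest same-object entry:", np.abs(Q[~different]).max())
```

Because Q depends only on the row space of W, it is unchanged by the choice of object coordinate frames and by the motions, which is why the zero pattern can be read off without knowing the grouping in advance. With noisy measurements the rank estimate and the zero test would both need a more careful treatment than the fixed tolerance assumed here.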

computer vision, image understanding, 3D vision, shape from motion, motion analysis, invariants





Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • João Paulo Costeira (1)
  • Takeo Kanade (2)
  1. Instituto de Sistemas e Robótica, Instituto Superior Técnico, Lisboa CODEX, Portugal
  2. Carnegie Mellon University, Pittsburgh, USA
