The Visual Computer, Volume 19, Issue 6, pp 377–394

A semi-direct approach to structure from motion

  • Hailin Jin
  • Paolo Favaro
  • Stefano Soatto
Special issue on computational video


The problem of structure from motion is often decomposed into two steps: feature correspondence and three-dimensional reconstruction. This separation often causes gross errors when the correspondence step fails. We therefore argue for integrating visual information not only in time (i.e. across different views) but also in space, by matching regions, rather than points, using explicit photometric deformation models. We present an algorithm that integrates image-feature tracking and three-dimensional motion estimation into a closed loop, while detecting and rejecting outlier regions that do not fit the model. Because of occlusions and the causal nature of our algorithm, drift accumulates in the estimates over time. We describe a method to perform global registration of local estimates of motion and structure by matching the appearance of feature regions stored over long time periods. We use image intensities to construct a score function that takes into account changes in brightness and contrast. Our algorithm is recursive and suitable for real-time implementation.
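To make the idea of an intensity-based score that discounts brightness and contrast changes concrete, the following is a minimal sketch using normalized cross-correlation (NCC), which is unchanged when one region's intensities are scaled by a positive gain and shifted by an offset. NCC is a stand-in chosen for illustration; the abstract does not specify the score function, and the name `ncc_score` and the sample intensity values are hypothetical.

```python
import math


def ncc_score(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized intensity patches.

    Patches are flat sequences of intensities. The score lies in [-1, 1] and
    is invariant to an affine change I -> gain * I + offset with gain > 0,
    i.e. to a contrast and brightness change. Illustrative stand-in only,
    not the score function defined in the paper.
    """
    if len(patch_a) != len(patch_b) or len(patch_a) == 0:
        raise ValueError("patches must be non-empty and of equal size")

    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n

    # Subtracting the mean removes the brightness offset.
    dev_a = [v - mean_a for v in patch_a]
    dev_b = [v - mean_b for v in patch_b]

    # Dividing by the norms removes the contrast gain.
    norm_a = math.sqrt(sum(v * v for v in dev_a))
    norm_b = math.sqrt(sum(v * v for v in dev_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a constant patch carries no texture to match

    return sum(x * y for x, y in zip(dev_a, dev_b)) / (norm_a * norm_b)


# Hypothetical stored appearance of a feature region.
reference = [10.0, 20.0, 30.0, 25.0, 15.0, 5.0]
# The same region seen under a different contrast (gain) and brightness (offset).
relit = [1.5 * v + 12.0 for v in reference]
# A region with the same intensities in a different spatial arrangement.
mismatch = [30.0, 5.0, 10.0, 25.0, 20.0, 15.0]

print(ncc_score(reference, relit))     # close to 1: same region despite relighting
print(ncc_score(reference, mismatch))  # clearly lower: different appearance
```

A score of this kind lets a stored region be compared against a much later view without the match failing merely because the illumination has changed, which is the property that long-term registration by appearance relies on.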


Keywords: Structure from motion · Direct methods · Extended Kalman filter · Observability · Tracking





Copyright information

© Springer-Verlag 2003

Authors and Affiliations

  1. Department of Electrical Engineering, Washington University, Saint Louis, USA
  2. Computer Science Department, UCLA, Los Angeles, USA
