International Journal of Computer Vision, Volume 1, Issue 4, pp 333–356

Active vision

  • John Aloimonos
  • Isaac Weiss
  • Amit Bandyopadhyay


We investigate several basic problems in vision under the assumption that the observer is active. An observer is called active when engaged in some kind of activity whose purpose is to control the geometric parameters of the sensory apparatus. The purpose of the activity is to manipulate the constraints underlying the observed phenomena in order to improve the quality of the perceptual results. For example, a monocular observer that moves with a known or unknown motion and a binocular observer that can rotate its eyes and track environmental objects are both observers that we call active. We prove that an active observer can solve basic vision problems in a much more efficient way than a passive one. Problems that are ill-posed and nonlinear for a passive observer become well-posed and linear for an active observer. In particular, the problems of shape from shading and depth computation, shape from contour, shape from texture, and structure from motion are shown to be much easier for an active observer than for a passive one. We emphasize that correspondence is not used in our approach, i.e., active vision is not correspondence of features from multiple viewpoints. Finally, active vision here does not mean active sensing; this paper introduces a general methodology, a framework in which we believe low-level vision problems should be addressed.


Keywords (added by machine, not by the authors): Computer Vision, Geometric Parameter, Computer Image, General Framework, Basic Problem





Copyright information

© Kluwer Academic Publishers 1987

Authors and Affiliations

  • John Aloimonos (1)
  • Isaac Weiss (1)
  • Amit Bandyopadhyay (2)

  1. Computer Vision Laboratory, Center for Automation Research, University of Maryland, College Park
  2. Department of Computer Science, SUNY Stony Brook, Stony Brook
