Artificial Intelligence Approaches to Image Understanding

  • Conference paper
Pattern Recognition Theory and Applications

Part of the book series: NATO Advanced Study Institutes Series (ASIC, volume 81)

Abstract

The past decade has seen considerable progress in the development of computer vision within Artificial Intelligence. Attention has shifted from restrictions on the domain of application of a vision system to restrictions on the visual abilities studied. Mathematical analyses have been offered for some of the elements of visual perception, such as the relationship between image irradiance and scene radiance, the location of important intensity changes, and motion primitives. In each case, it is observed that the information in the image only partially constrains the interpretation of the image, and further constraints are sought. The constraints embody commitments about the way the world is, at least most of the time. For example, the world mostly consists of smooth surfaces, and scenes are mostly viewed from general position, free of accidental alignments. Perceptual abilities such as stereopsis, lightness determination, and shape from shading and from texture require that the appropriate constraints be uncovered and appropriately expressed. Representations have been developed that make explicit the information computed by a perceptual ability. Examples include the Primal Sketch, the Reflectance Map, and object representations based on generalised cones. The isolation of representations has led to a view of visual perception as the process of constructing instances of a sequence of representations. The input to a particular process is often not the image per se, but a representation of the information computed by a number of processes. It is this observation that most strongly distinguishes image understanding from conventional pattern recognition. A number of sample image understanding systems are described, including edge detection, shape from shading, binocular and photometric stereo, optical flow, directional selectivity, surface reconstruction through interpolation, and the representation of objects by primitive volumes.
In some cases, it has been possible to relate the theory embodied in the program directly to animate visual systems. In some cases, it has been possible to develop important practical applications, for example industrial inspection and bin picking in robotics, and monitoring airfields or terrain for changes in usage. Finally, in some cases it has been possible to construct hardware realizations of theories to achieve real-time performance.
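
The observation that image information only partially constrains the interpretation can be illustrated with a small sketch. The Python below is not taken from the chapter. It assumes an idealised Lambertian surface patch, distant point light sources with known directions, and no shadows or noise. The function names (`lambertian_intensity`, `photometric_stereo`) and all numeric values are hypothetical, chosen only for illustration. Under one light source, two different surface orientations produce the same brightness. Adding two more known light sources, in the spirit of photometric stereo, supplies the further constraints needed to recover the orientation and albedo.

```python
import math


def unit(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambertian_intensity(normal, light, albedo=1.0):
    """Brightness of a Lambertian patch: albedo * max(0, n . s)."""
    dot = sum(n * s for n, s in zip(normal, light))
    return albedo * max(0.0, dot)


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def photometric_stereo(lights, intensities):
    """Recover (unit normal, albedo) from three brightness values.

    Solves S g = I by Cramer's rule, where the rows of S are the
    light directions and g = albedo * normal. Assumes the patch is
    lit (not in shadow) under all three lights.
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    g = []
    for col in range(3):
        m = [list(row) for row in lights]
        for row in range(3):
            m[row][col] = intensities[row]
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(c * c for c in g))
    return tuple(c / albedo for c in g), albedo


# One light: two distinct orientations give the same brightness,
# so a single measurement cannot distinguish them.
overhead = (0.0, 0.0, 1.0)
tilted_x = unit((0.6, 0.0, 0.8))
tilted_y = unit((0.0, 0.6, 0.8))
same = math.isclose(
    lambertian_intensity(tilted_x, overhead),
    lambertian_intensity(tilted_y, overhead),
)

# Three non-coplanar lights: the orientation and albedo are determined.
lights = [overhead, unit((1.0, 0.0, 1.0)), unit((0.0, 1.0, 1.0))]
measured = [lambertian_intensity(tilted_x, s, albedo=0.5) for s in lights]
recovered_normal, recovered_albedo = photometric_stereo(lights, measured)
```

Here `same` is true because both tilted patches make the same angle with the overhead light, while `recovered_normal` and `recovered_albedo` agree with `tilted_x` and 0.5 up to floating-point error. Cramer's rule is used only to keep the sketch free of external libraries; a practical system would use a linear-algebra routine and would have to handle shadows, noise, and non-Lambertian reflectance.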

Copyright information

© 1982 D. Reidel Publishing Company

About this paper

Cite this paper

Brady, M. (1982). Artificial Intelligence Approaches to Image Understanding. In: Kittler, J., Fu, K.S., Pau, LF. (eds) Pattern Recognition Theory and Applications. NATO Advanced Study Institutes Series, vol 81. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-7772-3_15

  • DOI: https://doi.org/10.1007/978-94-009-7772-3_15

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-009-7774-7

  • Online ISBN: 978-94-009-7772-3

  • eBook Packages: Springer Book Archive
