International Journal of Computer Vision, Volume 29, Issue 3, pp 233–253

Generalization to Novel Views: Universal, Class-based, and Model-based Processing

  • Yael Moses
  • Shimon Ullman
Article

Abstract

A major problem in object recognition is that a novel image of a given object can differ from all previously seen images. Images can vary considerably due to changes in viewing conditions such as viewing position and illumination. In this paper we distinguish between three types of recognition schemes by the level at which generalization to novel images takes place: universal, class-based, and model-based. The first is applicable equally to all objects, the second to a class of objects, and the third uses known properties of individual objects. We derive theoretical limitations on each of the three generalization levels. For the universal level, previous results have shown that no invariance can be obtained. Here we show that this limitation holds even when the assumptions made on the objects and the recognition functions are relaxed. We also extend the results to changes of illumination direction. For the class level, previous studies presented specific examples of classes of objects for which functions invariant to viewpoint exist. Here, we distinguish between classes that admit such invariance and classes that do not. We demonstrate that there is a tradeoff between the set of objects that can be discriminated by a given recognition function and the set of images from which the recognition function can recognize these objects. Furthermore, we demonstrate that although functions that are invariant to illumination direction do not exist at the universal level, a function invariant to illumination direction can be defined when the objects are restricted to belong to a given class. A general conclusion of this study is that class-based processing, which has not been used extensively in the past, is often advantageous for dealing with variations due to viewpoint and illuminant changes.
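As a concrete illustration of what a class-level invariant can look like, the following minimal Python sketch uses a standard textbook example that is not taken from the paper. It assumes the class of planar objects, assumes that two views of such an object are related by a 2D affine map, and assumes that point correspondences are already known. The function name `affine_coords` and the specific points, matrix, and translation are hypothetical values chosen for illustration.

```python
import numpy as np

def affine_coords(points):
    """Coordinates of the 4th point in the frame spanned by the first three.

    Assumption for this sketch: `points` is a 4x2 array of image points of
    a planar object, with the first three points not collinear.
    """
    p0, p1, p2, p3 = np.asarray(points, dtype=float)
    # Basis vectors of the frame, stored as the columns of a 2x2 matrix.
    basis = np.column_stack([p1 - p0, p2 - p0])
    # Solve basis @ c = p3 - p0 for the coordinates c.
    return np.linalg.solve(basis, p3 - p0)

# Hypothetical planar object: three basis points plus one extra point.
planar_object = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.7, 0.4]])

# Hypothetical novel view, modeled as an invertible 2D affine map x -> A x + t.
A = np.array([[1.3, -0.4], [0.5, 0.9]])
t = np.array([2.0, -1.0])
novel_view = planar_object @ A.T + t

reference = affine_coords(planar_object)
novel = affine_coords(novel_view)

# The two coordinate vectors agree up to floating-point error.
print(np.allclose(reference, novel))
```

The coordinates agree because the affine map transforms the basis vectors and the offset of the fourth point by the same matrix, so the matrix cancels when the linear system is solved. For four points that are not coplanar in 3D, the same quantity generally changes with viewpoint, which is why this function is invariant only within the restricted class.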

Keywords

object recognition, invariance

Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • Yael Moses (1)
  • Shimon Ullman (1)
  1. Department of Applied Mathematics and Computer Science, The Weizmann Institute of Science, Rehovot, Israel
