International Journal of Computer Vision, Volume 36, Issue 1, pp 31–50

Recognition without Correspondence using Multidimensional Receptive Field Histograms

  • Bernt Schiele
  • James L. Crowley

Abstract

The appearance of an object is composed of local structure. This local structure can be described and characterized by a vector of local features measured by local operators such as Gaussian derivatives or Gabor filters. This article presents a technique in which the appearances of objects are represented by the joint statistics of such local neighborhood operators. As such, it represents a new class of appearance-based techniques for computer vision. Based on joint statistics, the paper develops techniques for the identification of multiple objects at arbitrary positions and orientations in a cluttered scene. Experiments show that these techniques can identify over 100 objects in the presence of major occlusions. Most remarkably, the techniques have low complexity and therefore run in real time.

object recognition, appearance based recognition, statistical object representation, local appearance, real-time computer vision
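
The idea of representing appearance by the joint statistics of local operators can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the filter scale, the number of histogram bins, or the matching measure, so `sigma`, `bins`, `vmax`, and every function name below are assumptions chosen for illustration. Histogram intersection, known from Swain and Ballard's color indexing, is used here only as one plausible way to compare two histograms.

```python
import math

def gaussian_kernels(sigma):
    """Sampled 1-D Gaussian and its first derivative (truncated at 3 sigma)."""
    radius = int(math.ceil(3 * sigma))
    offsets = range(-radius, radius + 1)
    g = [math.exp(-u * u / (2 * sigma * sigma)) for u in offsets]
    total = sum(g)
    g = [v / total for v in g]
    dg = [-u / (sigma * sigma) * v for u, v in zip(offsets, g)]
    return g, dg

def filter_separable(img, kx, ky):
    """Convolve img (a list of rows) with kx along x, then ky along y.
    Pixels outside the image are replaced by the nearest border pixel."""
    h, w = len(img), len(img[0])
    r = len(kx) // 2
    rows = [[sum(kx[k] * row[min(max(x + r - k, 0), w - 1)]
                 for k in range(len(kx)))
             for x in range(w)] for row in img]
    return [[sum(ky[k] * rows[min(max(y + r - k, 0), h - 1)][x]
                 for k in range(len(ky)))
             for x in range(w)] for y in range(h)]

def receptive_field_histogram(img, sigma=1.0, bins=9, vmax=50.0):
    """Normalized joint 2-D histogram of the (Dx, Dy) Gaussian derivative
    responses. No pixel positions are stored, so no correspondence between
    image points is needed when two histograms are compared."""
    g, dg = gaussian_kernels(sigma)
    dx = filter_separable(img, dg, g)   # derivative along x, smoothing along y
    dy = filter_separable(img, g, dg)   # smoothing along x, derivative along y

    def bin_of(v):
        # Map [-vmax, vmax] onto 0..bins-1; out-of-range values go to edge bins.
        return min(bins - 1, max(0, int((v + vmax) / (2 * vmax) * bins)))

    hist = [[0.0] * bins for _ in range(bins)]
    weight = 1.0 / (len(img) * len(img[0]))
    for row_dx, row_dy in zip(dx, dy):
        for a, b in zip(row_dx, row_dy):
            hist[bin_of(a)][bin_of(b)] += weight
    return hist

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for r1, r2 in zip(h1, h2) for a, b in zip(r1, r2))

# Synthetic test images (hypothetical data): vertical stripes, the same
# stripes shifted by one pixel, and the stripes turned by 90 degrees.
stripes = [[100.0 if (x // 2) % 2 else 0.0 for x in range(16)] for _ in range(16)]
shifted = [row[-1:] + row[:-1] for row in stripes]
rotated = [list(col) for col in zip(*stripes)]

h_stripes = receptive_field_histogram(stripes)
score_same = histogram_intersection(h_stripes, h_stripes)
score_shifted = histogram_intersection(h_stripes, receptive_field_histogram(shifted))
score_rotated = histogram_intersection(h_stripes, receptive_field_histogram(rotated))
print(score_same, score_shifted, score_rotated)
```

In this sketch the shifted stripes score higher against the original than the rotated stripes do, because a translation leaves the distribution of derivative responses largely unchanged, while a rotation moves the mass from the Dx axis to the Dy axis of the histogram. Handling arbitrary orientations, as the abstract describes, would need rotation-invariant or steerable measurements, which this sketch does not attempt.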

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Bernt Schiele (1)
  • James L. Crowley (2)

  1. MIT Media Laboratory, Cambridge, USA
  2. GRAVIR, INRIA Rhône–Alpes, Montbonnot, France
