A Fragment-Based Approach to Object Representation and Classification

  • Conference paper
  • Visual Form 2001 (IWVF 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2059)

Abstract

The task of visual classification is to recognize an object in an image as belonging to a general class of similar objects, such as faces, cars, or dogs. This is a fundamental and natural task for biological visual systems, but it has proven difficult for artificial computer vision systems. The main reason for this difficulty is the variability of shape within a class: different objects vary widely in appearance, and it is difficult to capture the essential shape features that characterize the members of one category and distinguish them from those of another, such as dogs from cats.

In this paper we describe an approach to classification using a fragment-based representation. In this approach, objects within a class are represented in terms of common image fragments, which serve as building blocks for representing a large variety of different objects that belong to the same class. The fragments are selected from a training set of images based on a criterion of maximizing the mutual information between the fragments and the class they represent. For the purpose of classification, the fragments are also organized into types, where each type is a collection of alternative fragments, such as different hairline or eye regions for face classification. During classification, the algorithm detects fragments of the different types and then combines the evidence from the detected fragments to reach a final decision. Experiments indicate that it is possible to trade off the complexity of the fragments against the complexity of the combination and decision stage, and this tradeoff is discussed.
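The selection criterion described above can be illustrated with a small sketch. It assumes that each candidate fragment has already been reduced to a binary detection vector over the training images (1 if the fragment was detected in the image, 0 otherwise) and that class membership is also binary. The function names `mutual_information` and `select_fragments`, the fragment names, and the independent top-k ranking are illustrative assumptions, not the procedure reported in the paper, which may differ in how fragments are detected and combined.

```python
import math
from typing import Dict, List, Sequence


def mutual_information(detected: Sequence[int], labels: Sequence[int]) -> float:
    """Mutual information I(F; C) in bits between two binary sequences.

    detected[i] is 1 if the fragment was found in image i, else 0.
    labels[i] is 1 if image i belongs to the class, else 0.
    """
    n = len(detected)
    if n == 0 or n != len(labels):
        raise ValueError("inputs must be non-empty and of equal length")

    # Joint counts over the four (fragment, class) outcomes.
    joint = {(f, c): 0 for f in (0, 1) for c in (0, 1)}
    for f, c in zip(detected, labels):
        joint[(f, c)] += 1

    mi = 0.0
    for (f, c), count in joint.items():
        if count == 0:
            continue  # a zero-probability outcome contributes nothing
        p_fc = count / n
        p_f = (joint[(f, 0)] + joint[(f, 1)]) / n  # marginal of the fragment
        p_c = (joint[(0, c)] + joint[(1, c)]) / n  # marginal of the class
        mi += p_fc * math.log2(p_fc / (p_f * p_c))
    return mi


def select_fragments(
    detections: Dict[str, Sequence[int]], labels: Sequence[int], k: int
) -> List[str]:
    """Rank candidate fragments by I(F; C) and keep the k most informative.

    Hypothetical helper: each fragment is scored independently of the others.
    """
    scores = {name: mutual_information(d, labels) for name, d in detections.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy data: four training images, the first two belong to the class.
    labels = [1, 1, 0, 0]
    candidates = {
        "eye": [1, 1, 0, 0],    # detected exactly in the class images
        "noise": [1, 0, 1, 0],  # detected independently of the class
        "hair": [1, 1, 1, 0],   # detected in the class images plus one other
    }
    print(select_fragments(candidates, labels, k=2))
```

In this toy example the fragment that fires exactly on the class images carries one full bit of information about the class and is ranked first, while the fragment whose detections are independent of the class scores zero and is discarded.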

The method differs from previous part-based methods in its use of class-specific object fragments of varying complexity, its method of selecting fragments, and its organization of fragments into types. Experimental results on detecting face and car views show that the fragment-based approach can generalize well to a variety of novel image views within a class while maintaining low misclassification error rates. We briefly discuss relationships between the proposed method and properties of the parts of the primate visual system involved in object perception.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ullman, S., Sali, E., Vidal-Naquet, M. (2001). A Fragment-Based Approach to Object Representation and Classification. In: Arcelli, C., Cordella, L.P., di Baja, G.S. (eds) Visual Form 2001. IWVF 2001. Lecture Notes in Computer Science, vol 2059. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45129-3_7

  • DOI: https://doi.org/10.1007/3-540-45129-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42120-7

  • Online ISBN: 978-3-540-45129-7

  • eBook Packages: Springer Book Archive
