A Sparse Object Category Model for Efficient Learning and Complete Recognition

  • Rob Fergus
  • Pietro Perona
  • Andrew Zisserman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)

Abstract

We present a “parts and structure” model for object category recognition that can be learnt efficiently and in a weakly-supervised manner: the model is learnt from example images containing category instances, without requiring segmentation from background clutter.

The model is a sparse representation of the object, and consists of a star topology configuration of parts modeling the output of a variety of feature detectors. The optimal choice of feature types (whose repertoire includes interest points, curves and regions) is made automatically.

In recognition, the model may be applied efficiently in a complete manner, bypassing the need for feature detectors, to give the globally optimal match within a query image. The approach is demonstrated on a wide variety of categories, and delivers both successful classification and localization of the object within the image.
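
As an illustration of why a star topology permits a globally optimal match, the following minimal sketch scores every candidate location for a central part and, for each one, places every other part independently. It is not the authors' implementation: the function name, the grid of log-scores, the Gaussian spatial penalty and the exhaustive search are all assumptions made for this example, and a practical system would use a faster search.

```python
from itertools import product


def best_star_match(landmark_scores, part_scores, offsets, sigmas):
    """Exhaustively match a star-shaped model on a small grid (hypothetical helper).

    landmark_scores: 2D list of appearance log-scores for the central part.
    part_scores: list of 2D lists, one per non-central part.
    offsets: expected (dy, dx) of each part relative to the central part.
    sigmas: assumed spatial standard deviation of each part.
    Returns (total_score, landmark_location, part_locations).
    """
    h, w = len(landmark_scores), len(landmark_scores[0])
    cells = list(product(range(h), range(w)))
    best = None
    for ly, lx in cells:
        total = landmark_scores[ly][lx]
        chosen = []
        for scores, (dy, dx), sigma in zip(part_scores, offsets, sigmas):
            # Given the central part, each other part is placed independently,
            # which is what the star topology allows.
            def placed(cell):
                y, x = cell
                dist2 = (y - ly - dy) ** 2 + (x - lx - dx) ** 2
                return scores[y][x] - dist2 / (2.0 * sigma ** 2)

            cell = max(cells, key=placed)
            total += placed(cell)
            chosen.append(cell)
        if best is None or total > best[0]:
            best = (total, (ly, lx), chosen)
    return best


# Toy example: one central part and one part expected one cell to its right.
landmark = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
part = [[0, 0, 0], [0, 0, 4], [0, 0, 0]]
score, location, parts = best_star_match(landmark, [part], [(0, 1)], [1.0])
print(score, location, parts)
```

Because every location of the central part is tried and every other part is maximised independently, the returned configuration is the best one under this toy scoring function; the cost grows with the square of the number of grid cells, so the sketch is only suitable for small grids.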

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Rob Fergus (1)
  • Pietro Perona (2)
  • Andrew Zisserman (1)

  1. Dept. of Engineering Science, University of Oxford, Oxford, U.K.
  2. Dept. of Electrical Engineering, California Institute of Technology, Pasadena, U.S.A.
