Weakly Supervised Learning of Visual Models and Its Application to Content-Based Retrieval

Schmid, Cordelia

doi:10.1023/B:VISI.0000004829.38247.b0

Published: January 2004

International Journal of Computer Vision, volume 56, pages 7–16 (2004)

Abstract

This paper presents a method for weakly supervised learning of visual models. The visual model is based on a two-layer image description: a set of “generic” descriptors and their distribution over neighbourhoods. “Generic” descriptors represent sets of similar rotationally invariant feature vectors. Statistical spatial constraints describe the neighbourhood structure and make our description more discriminant. The joint probability of the frequencies of “generic” descriptors over a neighbourhood is multi-modal and is represented by a set of “neighbourhood-frequency” clusters. Our image description is rotationally invariant, is robust to model deformations, and efficiently characterizes “appearance-based” visual structure. Selecting distinctive clusters, those common in the positive examples and rare in the negative ones, determines the model features. Visual models are retrieved and localized using a probabilistic score. Experimental results for “textured” animals and faces show very good performance for retrieval as well as localization.
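
The two-layer description and the cluster selection step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no formulas, so the fixed-radius neighbourhood, the smoothed frequency ratio used as the "distinctive" criterion, and all names and parameters (`neighbourhood_frequencies`, `select_distinctive`, `radius`, `min_ratio`, `eps`) are assumptions made for illustration only.

```python
from collections import Counter
from math import hypot


def neighbourhood_frequencies(points, labels, n_generic, radius):
    """For each point, return the normalised histogram of generic-descriptor
    labels found within `radius` of it (the point itself is included).

    ASSUMPTION: a fixed-radius circular neighbourhood; the abstract does not
    say how neighbourhoods are defined.
    """
    histograms = []
    for x0, y0 in points:
        counts = [0] * n_generic
        for (x, y), label in zip(points, labels):
            if hypot(x - x0, y - y0) <= radius:
                counts[label] += 1
        total = sum(counts)  # at least 1, since the point counts itself
        histograms.append([c / total for c in counts])
    return histograms


def select_distinctive(pos_clusters, neg_clusters, n_clusters,
                       min_ratio=3.0, eps=1.0):
    """Keep clusters that are common in positive and rare in negative examples.

    pos_clusters / neg_clusters: cluster ids observed in each example set.
    ASSUMPTION: "distinctive" is modelled as a smoothed frequency ratio
    exceeding `min_ratio`; the actual criterion is not given in the abstract.
    """
    pos, neg = Counter(pos_clusters), Counter(neg_clusters)
    n_pos, n_neg = len(pos_clusters), len(neg_clusters)
    selected = []
    for k in range(n_clusters):
        p_pos = (pos[k] + eps) / (n_pos + eps * n_clusters)
        p_neg = (neg[k] + eps) / (n_neg + eps * n_clusters)
        if p_pos / p_neg >= min_ratio:
            selected.append(k)
    return selected


# Toy usage: three interest points with generic-descriptor labels 0, 1, 1.
hists = neighbourhood_frequencies([(0, 0), (1, 0), (10, 10)], [0, 1, 1],
                                  n_generic=2, radius=2.0)
chosen = select_distinctive([0, 0, 0, 0, 1, 2], [1, 1, 2, 2, 2, 2],
                            n_clusters=3)
```

In this toy setting the histograms stand in for the "neighbourhood-frequency" vectors that would be clustered, and `chosen` lists the cluster ids that would serve as model features. The quadratic neighbour search is chosen for brevity, not efficiency.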

Author information

Authors and Affiliations

INRIA Rhône-Alpes, 655 av. de l'Europe, 38330, Montbonnot, France
Cordelia Schmid

About this article

Cite this article

Schmid, C. Weakly Supervised Learning of Visual Models and Its Application to Content-Based Retrieval. International Journal of Computer Vision 56, 7–16 (2004). https://doi.org/10.1023/B:VISI.0000004829.38247.b0
