Foreground Focus: Unsupervised Learning from Partially Matching Images

Lee, Yong Jae; Grauman, Kristen

doi:10.1007/s11263-009-0252-y

Published: 27 May 2009

Volume 85, pages 143–166 (2009)
International Journal of Computer Vision

Yong Jae Lee¹ &
Kristen Grauman²

Abstract

We present a method to automatically discover meaningful features in unlabeled image collections. Each image is decomposed into semi-local features that describe neighborhood appearance and geometry. The goal is to determine for each image which of these parts are most relevant, given the image content in the remainder of the collection. Our method first computes an initial image-level grouping based on feature correspondences, and then iteratively refines cluster assignments based on the evolving intra-cluster pattern of local matches. As a result, the significance attributed to each feature influences an image’s cluster membership, while related images in a cluster affect the estimated significance of their features. We show that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and apply the technique to automatically discover categories and foreground regions in images from benchmark datasets.
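
The mutual reinforcement described in the abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the matching or clustering procedures, so the Gaussian feature kernel, the best-match image similarity, the k-medoids grouping, and every function name below are stand-ins chosen for brevity. Feature weights shape image similarity, image similarity determines clusters, and clusters in turn re-estimate feature weights.

```python
import numpy as np

def feature_kernel(A, B, sigma=1.0):
    # Gaussian similarity between every feature in A and every feature in B (assumed kernel).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def image_similarity(images, weights):
    # Weighted sum of each feature's best match in the other image, then symmetrized.
    n = len(images)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            best = feature_kernel(images[i], images[j]).max(axis=1)
            S[i, j] = weights[i] @ best
    return 0.5 * (S + S.T)

def cluster_by_medoids(S, k, n_iter=10):
    # Simple k-medoids on a similarity matrix with farthest-first initialization.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(int(S[:, medoids].max(axis=1).argmin()))
    for _ in range(n_iter):
        labels = S[:, medoids].argmax(axis=1)
        updated = []
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                updated.append(medoids[c])  # keep the old medoid for an empty cluster
                continue
            within = S[np.ix_(members, members)].sum(axis=1)
            updated.append(int(members[within.argmax()]))
        if updated == medoids:
            break
        medoids = updated
    return S[:, medoids].argmax(axis=1)

def update_weights(images, labels, weights):
    # A feature is weighted by how well it matches, on average, into its cluster peers.
    new_weights = []
    for i, feats in enumerate(images):
        peers = [j for j in np.flatnonzero(labels == labels[i]) if j != i]
        if not peers:
            new_weights.append(weights[i])  # singleton cluster: nothing to learn from
            continue
        support = np.mean(
            [feature_kernel(feats, images[j]).max(axis=1) for j in peers], axis=0
        )
        new_weights.append(support / support.sum())
    return new_weights

def mutual_reinforcement_sketch(images, k, n_rounds=5):
    # Start from uniform feature weights, then alternate clustering and re-weighting.
    weights = [np.full(len(f), 1.0 / len(f)) for f in images]
    for _ in range(n_rounds):
        S = image_similarity(images, weights)
        labels = cluster_by_medoids(S, k)
        weights = update_weights(images, labels, weights)
    return labels, weights

def make_toy_collection(seed=0, per_class=4, dim=8):
    # Synthetic data: 3 shared "foreground" features per class, then 3 random "background" ones.
    rng = np.random.default_rng(seed)
    prototypes = [3.0 * np.eye(dim)[0], 3.0 * np.eye(dim)[1]]
    images = []
    for p in prototypes:
        for _ in range(per_class):
            fg = p + 0.05 * rng.standard_normal((3, dim))
            bg = rng.standard_normal((3, dim))
            images.append(np.vstack([fg, bg]))
    return images

images = make_toy_collection()
labels, weights = mutual_reinforcement_sketch(images, k=2)
```

On this synthetic collection the first three rows of each image are the shared features, so after a few rounds they should carry most of the weight while the random background rows are suppressed. Real images would need an actual local descriptor and a more careful matching step than the toy kernel used here.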

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, 78712, USA
Yong Jae Lee
Department of Computer Sciences, University of Texas at Austin, Austin, TX, 78712, USA
Kristen Grauman

Corresponding author

Correspondence to Yong Jae Lee.

About this article

Cite this article

Lee, Y.J., Grauman, K. Foreground Focus: Unsupervised Learning from Partially Matching Images. Int J Comput Vis 85, 143–166 (2009). https://doi.org/10.1007/s11263-009-0252-y

Received: 11 July 2008
Accepted: 13 May 2009
Published: 27 May 2009
Issue Date: November 2009
DOI: https://doi.org/10.1007/s11263-009-0252-y
