International Journal of Computer Vision, Volume 85, Issue 2, pp 143–166

Foreground Focus: Unsupervised Learning from Partially Matching Images

Abstract

We present a method to automatically discover meaningful features in unlabeled image collections. Each image is decomposed into semi-local features that describe neighborhood appearance and geometry. The goal is to determine for each image which of these parts are most relevant, given the image content in the remainder of the collection. Our method first computes an initial image-level grouping based on feature correspondences, and then iteratively refines cluster assignments based on the evolving intra-cluster pattern of local matches. As a result, the significance attributed to each feature influences an image’s cluster membership, while related images in a cluster affect the estimated significance of their features. We show that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and apply the technique to automatically discover categories and foreground regions in images from benchmark datasets.
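
The mutual reinforcement described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the descriptor, the matching function, or the clustering algorithm, so the best-match score `exp(-distance)`, the toy k-medoids step, the function names, and the synthetic data below are all assumptions chosen only to show the alternation between clustering images and re-weighting their features.

```python
import numpy as np

def match_scores(a, b):
    """Similarity of each feature (row) of `a` to its closest feature in `b`."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return np.exp(-d.min(axis=1))  # assumed score; 1.0 means a perfect match

def weighted_affinity(images, weights):
    """Image-level affinity: best-match scores weighted by feature significance."""
    n = len(images)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.dot(weights[i], match_scores(images[i], images[j]))
    return 0.5 * (K + K.T)  # symmetrize

def cluster(K, k, n_iter=10):
    """Toy k-medoids on an affinity matrix (a stand-in for any clustering method)."""
    medoids = [0]
    while len(medoids) < k:  # farthest-first initialization
        medoids.append(int(np.argmin(K[:, medoids].max(axis=1))))
    for _ in range(n_iter):
        labels = np.argmax(K[:, medoids], axis=1)
        new = []
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:  # keep the old medoid if a cluster empties
                new.append(medoids[c])
                continue
            within = K[np.ix_(members, members)].sum(axis=1)
            new.append(int(members[np.argmax(within)]))
        if new == medoids:
            break
        medoids = new
    return np.argmax(K[:, medoids], axis=1)

def update_weights(images, labels):
    """A feature is significant if it matches well within its image's cluster."""
    weights = []
    for i, x in enumerate(images):
        peers = [j for j in np.flatnonzero(labels == labels[i]) if j != i]
        if peers:
            w = np.mean([match_scores(x, images[j]) for j in peers], axis=0)
        else:
            w = np.ones(len(x))
        w = w + 1e-12  # guard against an all-zero weight vector
        weights.append(w / w.sum())
    return weights

def foreground_focus(images, k, n_rounds=5):
    """Alternate between clustering images and re-weighting their features."""
    weights = [np.full(len(x), 1.0 / len(x)) for x in images]  # uniform start
    for _ in range(n_rounds):
        labels = cluster(weighted_affinity(images, weights), k)
        weights = update_weights(images, labels)
    return labels, weights

# Synthetic data: 3 images per "object"; in each image the first 3 features are
# foreground (near a shared prototype) and the last 3 are random clutter.
rng = np.random.default_rng(0)
prototypes = [np.zeros(4), np.full(4, 5.0)]
images = []
for proto in prototypes:
    for _ in range(3):
        fg = proto + 0.1 * rng.standard_normal((3, 4))
        bg = rng.uniform(-50.0, 50.0, size=(3, 4))
        images.append(np.vstack([fg, bg]))

labels, weights = foreground_focus(images, k=2)
print(labels)
print([round(float(w[:3].sum()), 3) for w in weights])  # weight mass on foreground
```

In this toy setting the clutter features rarely find close matches in other images, so after re-weighting most of each image's weight should concentrate on the features shared within its cluster, which in turn sharpens the image-level affinities used in the next clustering round.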

Keywords

Object recognition · Feature selection · Unsupervised learning · Feature descriptor

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, USA
  2. Department of Computer Sciences, University of Texas at Austin, Austin, USA
