“Clustering by Composition” – Unsupervised Discovery of Image Categories

  • Alon Faktor
  • Michal Irani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7578)


We define a “good image cluster” as one in which images can be easily composed (like a puzzle) using pieces from each other, while being difficult to compose from images outside the cluster. The larger and more statistically significant the pieces are, the stronger the affinity between the images. This gives rise to unsupervised discovery of very challenging image categories. We further show how multiple images can be composed from each other simultaneously and efficiently using a collaborative randomized search algorithm. This collaborative process exploits the “wisdom of crowds of images” to obtain a sparse yet meaningful set of image affinities, in time that is almost linear in the size of the image collection. “Clustering-by-Composition” can be applied to very few images (where a ‘cluster model’ cannot be ‘learned’), as well as to benchmark evaluation datasets, and yields state-of-the-art results.
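
The idea that larger and rarer shared pieces imply stronger affinity can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes each image has already been reduced to a 1-D sequence of discrete "visual word" ids, treats a "piece" as a contiguous run of words shared by two images, and scores a piece by the sum of -log p(word), with p estimated from word frequencies in the collection. The function names (`word_costs`, `composition_affinity`, `affinity_matrix`) are hypothetical, and the exhaustive pairwise loop is quadratic in the number of images, unlike the near-linear collaborative randomized search described above.

```python
import math
from collections import Counter


def word_costs(images):
    """Return -log p(word) for every word, with p estimated over the whole collection."""
    counts = Counter(w for img in images for w in img)
    total = sum(counts.values())
    return {w: -math.log(c / total) for w, c in counts.items()}


def composition_affinity(a, b, cost):
    """Score of the best contiguous piece shared by sequences a and b.

    A piece scores the sum of -log p(word) over its words, so longer and
    rarer pieces score higher. Implemented as a weighted variant of the
    longest-common-substring dynamic program.
    """
    best = 0.0
    prev = [0.0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0.0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the shared run that ends at (i-1, j-1).
                cur[j] = prev[j - 1] + cost[a[i - 1]]
                best = max(best, cur[j])
        prev = cur
    return best


def affinity_matrix(images):
    """Dense pairwise affinities (toy version: checks every pair of images)."""
    cost = word_costs(images)
    n = len(images)
    return [
        [0.0 if i == j else composition_affinity(images[i], images[j], cost)
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two "images" sharing the run 1,2,3,4 and two sharing the run 6,5,6,5.
    toy = [[1, 2, 3, 4, 9], [7, 1, 2, 3, 4], [5, 6, 5, 6, 5], [6, 5, 6, 5, 8]]
    for row in affinity_matrix(toy):
        print([round(v, 2) for v in row])
```

In this toy setting, images that share a long run of infrequent words receive a high affinity, while images with no shared words receive zero. The resulting matrix could then be passed to any graph-based clustering method.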





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alon Faktor
    • 1
  • Michal Irani
    • 1
  1. Dept. of Computer Science and Applied Math, The Weizmann Institute of Science, Israel
