International Journal of Computer Vision, Volume 88, Issue 2, pp 284–302

Unsupervised Object Discovery: A Comparison

  • Tinne Tuytelaars
  • Christoph H. Lampert
  • Matthew B. Blaschko
  • Wray Buntine
Open Access
Article

Abstract

The goal of this paper is to evaluate and compare models and methods for learning to recognize basic entities in images in an unsupervised setting. In other words, we want to discover the objects present in the images by analyzing unlabeled data and searching for recurring patterns. We experiment with various baseline methods, methods based on latent variable models, and spectral clustering methods. The results are presented and compared on subsets of both Caltech256 and MSRC2, data sets that are larger, more challenging, and include more object classes than those previously reported in the literature. A rigorous framework for evaluating unsupervised object discovery methods is proposed.

Keywords

Object discovery, Unsupervised object recognition, Evaluation

Copyright information

© The Author(s) 2009

Authors and Affiliations

  • Tinne Tuytelaars (1)
  • Christoph H. Lampert (2)
  • Matthew B. Blaschko (2, 3)
  • Wray Buntine (4, 5)

  1. ESAT-PSI, K.U. Leuven, Leuven, Belgium
  2. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  3. University of Oxford, Oxford, UK
  4. NICTA, Canberra, Australia
  5. Australian National University, Canberra, Australia
