Measuring Image Distances via Embedding in a Semantic Manifold

  • Chen Fang
  • Lorenzo Torresani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7575)


In this work we introduce novel image metrics that can be used with distance-based classifiers or applied directly to decide whether two input images belong to the same class. While most prior image distances rely purely on comparisons of low-level features extracted from the inputs, our metrics use a large database of labeled photos as auxiliary data to draw semantic relationships between the two images, beyond those computable from simple visual features. In a preprocessing stage, our approach derives a semantic image graph from the labeled dataset, where the nodes are the labeled images and the edges connect pictures with related labels. The graph can be viewed as modeling a semantic image manifold, and it enables the use of graph distances to approximate semantic distances. Thus, we reformulate the task of measuring the semantic distance between two unlabeled pictures as the problem of embedding the two input images in the semantic graph. We propose and evaluate several embedding schemes and graph distance metrics. Our results on Caltech101, Caltech256 and ImageNet show that our distances consistently match or outperform the state of the art in this field.
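The pipeline described above (build a graph over labeled images, embed the two unlabeled inputs, measure a graph distance) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy feature vectors and labels, the `related` label pairs, the nearest-neighbor embedding in `embed`, and the hop-count distance in `hop_distance` are stand-ins for the embedding schemes and graph distance metrics that the authors actually evaluate.

```python
from collections import deque
import math


def build_semantic_graph(labels, related):
    """Connect labeled images whose labels are equal or listed as related.

    `related` is a hypothetical set of frozenset label pairs.
    """
    n = len(labels)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            pair = frozenset((labels[i], labels[j]))
            if labels[i] == labels[j] or pair in related:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def embed(query, features):
    """Assumed embedding: map an unlabeled image to the node whose
    feature vector is closest in Euclidean distance."""
    return min(range(len(features)), key=lambda i: math.dist(query, features[i]))


def hop_distance(graph, source, target):
    """Breadth-first search: edges on a shortest path, or inf if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return math.inf


def semantic_distance(a, b, features, graph):
    """Embed both unlabeled inputs, then measure the graph distance."""
    return hop_distance(graph, embed(a, features), embed(b, features))


# Toy labeled database (made-up 2-D features standing in for image descriptors).
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
labels = ["cat", "cat", "dog", "dog", "car"]
related = {frozenset(("cat", "dog"))}  # assumed label relation
graph = build_semantic_graph(labels, related)

near_cat = (0.0, 0.05)
near_dog = (1.0, 0.9)
near_car = (5.0, 5.1)
print(semantic_distance(near_cat, near_dog, features, graph))
print(semantic_distance(near_cat, near_car, features, graph))
```

In this toy setup, two inputs that land on nodes with related labels end up a small number of hops apart, while an input that lands on a node with no related labels is unreachable. A hop count is only the simplest possible graph distance; any other graph metric could be substituted in `hop_distance` without changing the rest of the sketch.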


Keywords: Support Vector Machine, Random Walk, Near Neighbor, Target Node, Semantic Distance



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Chen Fang (1)
  • Lorenzo Torresani (1)
  1. Computer Science Department, Dartmouth College, Hanover, USA
