Large Scale Online Learning of Image Similarity through Ranking

  • Gal Chechik
  • Varun Sharma
  • Uri Shalit
  • Samy Bengio
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5524)

Abstract

Learning a measure of similarity between pairs of objects is a fundamental problem in machine learning. Pairwise similarity plays a crucial role in classification algorithms like nearest neighbors, and is practically important for applications like searching for images that are similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are both visually similar and semantically related to a given object.

Unfortunately, current approaches for learning semantic similarity are limited to small-scale datasets, because their complexity grows quadratically with the sample size, and because they impose costly positivity constraints on the learned similarity functions. To address real-world large-scale AI problems, such as learning similarity over all images on the web, we need to develop new algorithms that scale to many samples, many classes, and many features.

The current abstract presents OASIS, an Online Algorithm for Scalable Image Similarity learning that learns a bilinear similarity measure over sparse representations. OASIS is an online dual approach using the passive-aggressive family of learning algorithms with a large margin criterion and an efficient hinge loss cost. Our experiments show that OASIS is both fast and accurate at a wide range of scales: for a dataset with thousands of images, it achieves better results than existing state-of-the-art methods, while being an order of magnitude faster. Comparing OASIS with different symmetric variants provides unexpected insights into the effect of symmetry on the quality of the similarity. For large, web-scale datasets, OASIS can be trained on more than two million images from 150K text queries within two days on a single CPU. Human evaluations showed that 35% of the ten top images ranked by OASIS were semantically relevant to a query image. This suggests that query-independent similarity can be accurately learned even for large-scale datasets that could not be handled before.
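The abstract names the ingredients (a bilinear similarity, a large-margin hinge loss, and a passive-aggressive online update) but does not spell out the update rule. The following is a minimal sketch of how they could fit together, assuming the standard PA-I step of Crammer et al. applied to a (query, relevant, irrelevant) triplet. The function names, the aggressiveness parameter `C`, and the use of dense Python lists instead of sparse vectors are illustrative assumptions, not details taken from the paper.

```python
def similarity(W, a, b):
    """Bilinear similarity S_W(a, b) = a^T W b (W need not be symmetric or PSD)."""
    return sum(a[i] * W[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def triplet_hinge_loss(W, query, pos, neg):
    """Hinge loss with margin 1: pos should outscore neg for this query."""
    return max(0.0, 1.0 - similarity(W, query, pos) + similarity(W, query, neg))


def pa_update(W, query, pos, neg, C=0.1):
    """One assumed PA-I step on a triplet; returns a new matrix, W is not mutated."""
    loss = triplet_hinge_loss(W, query, pos, neg)
    if loss == 0.0:
        return W  # passive: the margin is already satisfied
    diff = [p - n for p, n in zip(pos, neg)]
    # Update direction V = query * diff^T; its squared Frobenius norm factorizes.
    v_norm_sq = sum(q * q for q in query) * sum(d * d for d in diff)
    if v_norm_sq == 0.0:
        return W  # pos and neg are identical (or query is zero): nothing to learn
    tau = min(C, loss / v_norm_sq)  # aggressive step, capped at C
    return [[W[i][j] + tau * query[i] * diff[j] for j in range(len(diff))]
            for i in range(len(query))]


# Toy usage: start from the identity and apply a single update.
W0 = [[1.0, 0.0], [0.0, 1.0]]
query, pos, neg = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
W1 = pa_update(W0, query, pos, neg, C=0.1)
```

Because each step adds a rank-one matrix `query * diff^T`, the learned `W` is generally not symmetric, which is the kind of asymmetry the comparison with symmetric variants concerns. No positivity constraint is enforced in this sketch.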

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Gal Chechik (1)
  • Varun Sharma (1)
  • Uri Shalit (2)
  • Samy Bengio (1)
  1. Google, Mountain View, USA
  2. Hebrew University, Jerusalem, Israel
