Learning Binary Hash Codes for Large-Scale Image Search

  • Kristen Grauman
  • Rob Fergus
Part of the Studies in Computational Intelligence book series (SCI, volume 411)

Abstract

Algorithms to rapidly search massive image or video collections are critical for many vision applications, including visual search, content-based retrieval, and non-parametric models for object recognition. Recent work shows that learned binary projections are a powerful way to index large collections according to their content. The basic idea is to formulate the projections so as to approximately preserve a given similarity function of interest. Having done so, one can then search the data efficiently using hash tables, or by exploring the Hamming ball volume around a novel query. Both enable sub-linear time retrieval with respect to the database size. Further, depending on the design of the projections, in some cases it is possible to bound the number of database examples that must be searched in order to achieve a given level of accuracy.
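The two search strategies mentioned above, hash-table lookup and exploration of the Hamming ball around a query, can be illustrated with a minimal sketch. It assumes each binary code is stored as a Python integer; the function names (`build_table`, `hamming_ball`, `search`) and the toy codes are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict
from itertools import combinations


def build_table(codes):
    """Map each integer-encoded binary code to the ids of the items that share it."""
    table = defaultdict(list)
    for item_id, code in enumerate(codes):
        table[code].append(item_id)
    return table


def hamming_ball(code, n_bits, radius):
    """Yield every n_bits-wide code within the given Hamming radius of `code`."""
    for r in range(radius + 1):
        for positions in combinations(range(n_bits), r):
            probe = code
            for p in positions:
                probe ^= 1 << p  # flip bit p
            yield probe


def search(table, query_code, n_bits, radius):
    """Collect ids of database items whose code lies inside the Hamming ball."""
    hits = []
    for probe in hamming_ball(query_code, n_bits, radius):
        hits.extend(table.get(probe, []))
    return sorted(hits)


# Toy database of five 4-bit codes (illustrative values only).
codes = [0b0000, 0b0001, 0b0111, 0b1111, 0b0001]
table = build_table(codes)
result = search(table, 0b0000, n_bits=4, radius=1)
```

The number of probes is the sum of the binomial coefficients C(n_bits, r) for r up to the radius, so this style of lookup is practical mainly for compact codes and small radii; the cost depends on the code length and radius rather than on the number of database items.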

This chapter overviews data structures for fast search with binary codes, and then describes several supervised and unsupervised strategies for generating the codes. In particular, we review supervised methods that integrate metric learning, boosting, and neural networks into the hash key construction, and unsupervised methods based on spectral analysis or kernelized random projections that compute affinity-preserving binary codes. Whether learning from explicit semantic supervision or exploiting the structure among unlabeled data, these methods make scalable retrieval possible for a variety of robust visual similarity measures. We focus on defining the algorithms, and illustrate the main points with results using millions of images.
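As a point of reference for how a projection becomes a binary code, the following is a minimal sketch of plain sign-random-projection hashing, in which each bit records the side of a random hyperplane on which a feature vector falls. This is the simple, non-kernelized and non-learned baseline, not any of the learned methods the chapter reviews; the names `make_hyperplanes`, `encode`, and `hamming`, the seed, and the example vector are all hypothetical.

```python
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian directions in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(x, hyperplanes):
    """Pack the signs of the projections of x into one integer code."""
    code = 0
    for w in hyperplanes:
        bit = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(n_bits=16, dim=3)
x = [1.0, 2.0, 3.0]
code_x = encode(x, planes)
code_scaled = encode([2.0 * v for v in x], planes)   # same direction as x
code_opposite = encode([-v for v in x], planes)      # opposite direction
```

For this scheme, the probability that a single bit differs between two vectors equals the angle between them divided by π, so the Hamming distance between codes tracks angular distance. The learned approaches described in the chapter replace the random directions with projections fitted to the data or to supervision.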

Keywords

Hash Function, Binary Code, Hash Table, Neural Information Processing System, Locality Sensitive Hashing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2013

Authors and Affiliations

  1. Dept. of Computer Science, University of Texas, Austin, USA
  2. Courant Institute, New York University, New York, USA
