Data-Dependent Locality Sensitive Hashing

  • Hongtao Xie
  • Zhineng Chen
  • Yizhi Liu
  • Jianlong Tan
  • Li Guo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8879)

Abstract

Locality sensitive hashing (LSH) is the most popular algorithm for approximate nearest neighbor (ANN) search. Because LSH partitions the vector space uniformly while the distribution of vectors is usually non-uniform, it fits real datasets poorly and has limited performance. In this paper, we propose a new data-dependent LSH algorithm with a two-level structure for ANN search in high-dimensional spaces. In the first level, we train a number of cluster centers and use them to divide the dataset into many clusters, so that the vectors in each cluster have a nearly uniform distribution. In the second level, we construct LSH tables for each cluster. Given a query, we first determine the few clusters to which it belongs with high probability, and then perform ANN search in the corresponding LSH tables. Experimental results on the reference datasets show that the search speed can be increased by 48 times compared to E2LSH, while keeping high search precision.
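
The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes plain k-means for training the cluster centers, p-stable (Gaussian) hash functions of the E2LSH form floor((a·v + b) / w) for the per-cluster tables, and "the few clusters the query belongs to with high probability" approximated by the n_probe nearest centers. The names TwoLevelLSH, PStableTable, n_probe, and all parameter values are hypothetical.

```python
import math
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10, rng=None):
    """Plain Lloyd's k-means (an assumption; the abstract only says
    that cluster centers are trained)."""
    rng = rng or random.Random(0)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        for c, group in enumerate(groups):
            if group:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(group) for col in zip(*group)]
    return centers


class PStableTable:
    """One hash table whose key concatenates floor((a.v + b) / w) values."""

    def __init__(self, dim, n_hashes, w, rng):
        self.w = w
        self.funcs = [
            ([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, w))
            for _ in range(n_hashes)
        ]
        self.buckets = defaultdict(list)

    def key(self, v):
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in self.funcs
        )

    def add(self, idx, v):
        self.buckets[self.key(v)].append(idx)

    def candidates(self, v):
        return self.buckets.get(self.key(v), [])


class TwoLevelLSH:
    """Level 1: cluster the data. Level 2: separate LSH tables per cluster."""

    def __init__(self, data, n_clusters=4, n_tables=4, n_hashes=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.data = data
        dim = len(data[0])
        self.centers = kmeans(data, n_clusters, rng=rng)
        self.tables = [
            [PStableTable(dim, n_hashes, w, rng) for _ in range(n_tables)]
            for _ in range(n_clusters)
        ]
        # Each vector is indexed only in the tables of its nearest cluster.
        for idx, v in enumerate(data):
            c = self._nearest_clusters(v, 1)[0]
            for table in self.tables[c]:
                table.add(idx, v)

    def _nearest_clusters(self, v, m):
        order = sorted(range(len(self.centers)),
                       key=lambda c: sq_dist(v, self.centers[c]))
        return order[:m]

    def query(self, q, n_probe=2):
        """Return the index of the closest candidate, or None if no bucket hit."""
        cand = set()
        for c in self._nearest_clusters(q, n_probe):
            for table in self.tables[c]:
                cand.update(table.candidates(q))
        if not cand:
            return None
        return min(cand, key=lambda i: sq_dist(q, self.data[i]))
```

Under these assumptions, an index is built with TwoLevelLSH(data), where data is a list of equal-length float lists, and searched with index.query(q, n_probe=2). Probing more clusters trades speed for recall. The sketch makes no attempt to reproduce the speedup reported in the abstract.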

Keywords

Locality sensitive hashing; approximate nearest neighbor

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hongtao Xie (1)
  • Zhineng Chen (2)
  • Yizhi Liu (3)
  • Jianlong Tan (1)
  • Li Guo (1)

  1. National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  2. Interactive Digital Media Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  3. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
