Locality Sensitive Hashing Using GMM

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8753)


We propose a new approach to locality-sensitive hashing (LSH) for solving the approximate nearest neighbor problem. A well-known LSH family uses linear projections to place the samples of a dataset into different buckets. We extend this idea: instead of using equally spaced buckets, we use a Gaussian mixture model to build a data-dependent mapping.
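
The contrast between equally spaced buckets and a data-dependent mapping can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mixture model is turned into buckets, so the code assumes one plausible reading, namely that a one-dimensional Gaussian mixture is fitted to the projected samples and each sample is assigned to its most likely component. All function names, the bucket width `w`, the fixed direction `a`, and the toy data are hypothetical.

```python
import math
import random


def project(x, a):
    """Linear projection of sample x onto direction a."""
    return sum(xi * ai for xi, ai in zip(x, a))


def uniform_bucket(p, w, b=0.0):
    """Equally spaced buckets of width w along the projection axis."""
    return math.floor((p + b) / w)


def gauss_pdf(v, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(v - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, iters=50):
    """Small EM loop for a 1-D Gaussian mixture (assumed fitting procedure)."""
    s = sorted(values)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [s[int((j + 0.5) * len(s) / k)] for j in range(k)]
    spread = (s[-1] - s[0]) or 1.0
    variances = [(spread / k) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [w * gauss_pdf(v, m, s2)
                    for w, m, s2 in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2
                      for r, v in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances


def gmm_bucket(p, model):
    """Bucket index = most likely mixture component (assumed mapping)."""
    weights, means, variances = model
    scores = [w * gauss_pdf(p, m, s2)
              for w, m, s2 in zip(weights, means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: two clusters in 2-D, projected onto a fixed unit direction
# that stands in for a randomly drawn one.
rng = random.Random(1)
a = [0.6, 0.8]
data = [[rng.gauss(0, 0.3), rng.gauss(0, 0.3)] for _ in range(50)]
data += [[6 + rng.gauss(0, 0.3), 8 + rng.gauss(0, 0.3)] for _ in range(50)]
projected = [project(x, a) for x in data]
model = fit_gmm_1d(projected, k=2)
```

In this toy setting, a fixed grid with `w = 1.0` puts the projections -0.1 and 0.1 into different buckets even though they belong to the same cluster, whereas the mixture-based mapping keeps them together because its bucket boundaries follow the data.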



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. ISS, University of Stuttgart, Stuttgart, Germany