Evolving Computationally Efficient Hashing for Similarity Search
Finding nearest neighbors in high-dimensional spaces is computationally expensive. Locality-sensitive hashing is a general dimension reduction technique that maps similar elements close together in the hash space, streamlining near-neighbor lookup.
In this paper, we propose a variable-genome-length biased random-key genetic algorithm whose encoding facilitates the exploration of locality-sensitive hash functions that use only sparsely applied addition operations instead of the usual costly dense multiplications.
Experimental results show that the proposed method obtains highly efficient functions with a much higher mean average precision than standard methods using random projections, while also being much faster to compute.
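To make the contrast between dense multiplications and sparse additions concrete, the following is a minimal sketch and not the evolved functions reported in the paper. It compares a standard sign-random-projection hash with a hypothetical additions-only hash in which each bit compares two sums taken over small random subsets of the input coordinates. The function names, the `terms_per_bit` parameter, and the comparison-of-sums form are assumptions made for illustration; in the proposed method the structure of the hash function would be found by the genetic algorithm rather than drawn at random.

```python
import random


def make_dense_projection_hash(dim, n_bits, seed=0):
    """Sign-random-projection LSH: one dense Gaussian vector per output bit."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_fn(x):
        # Each bit costs `dim` multiplications and `dim` additions.
        return tuple(
            int(sum(w * xi for w, xi in zip(plane, x)) >= 0.0) for plane in planes
        )

    return hash_fn


def make_sparse_addition_hash(dim, n_bits, terms_per_bit, seed=0):
    """Hypothetical additions-only hash (illustrative assumption).

    Each bit compares the sum of one small coordinate subset with the sum of
    another, so it needs no multiplications. Requires 2 * terms_per_bit <= dim.
    """
    rng = random.Random(seed)
    bits = []
    for _ in range(n_bits):
        idx = rng.sample(range(dim), 2 * terms_per_bit)
        bits.append((idx[:terms_per_bit], idx[terms_per_bit:]))

    def hash_fn(x):
        # Each bit costs roughly 2 * terms_per_bit additions and one comparison.
        return tuple(
            int(sum(x[i] for i in plus) >= sum(x[j] for j in minus))
            for plus, minus in bits
        )

    return hash_fn


def hamming(a, b):
    """Number of positions in which two hash codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Small demonstration on synthetic data: a query, a slightly perturbed copy,
# and an unrelated random vector.
rng = random.Random(42)
dim = 64
query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
near = [v + rng.gauss(0.0, 0.05) for v in query]
far = [rng.gauss(0.0, 1.0) for _ in range(dim)]

dense = make_dense_projection_hash(dim, n_bits=32, seed=1)
sparse = make_sparse_addition_hash(dim, n_bits=32, terms_per_bit=4, seed=1)

for name, h in (("dense", dense), ("sparse", sparse)):
    print(name, "near:", hamming(h(query), h(near)), "far:", hamming(h(query), h(far)))
```

With `terms_per_bit=4` and 64-dimensional inputs, each sparse bit touches 8 coordinates and performs no multiplications, whereas each dense bit touches all 64 coordinates and multiplies each one. Whether such a sparse function preserves neighborhoods well depends on which coordinates are selected, which is the design space the proposed genetic algorithm is meant to explore.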
Keywords: Locality-sensitive hashing, Optimal design, Genetic algorithms, Variable length representation
This paper was partially supported by the Sapientia Institute for Research Programs (KPI). The work of Sándor Miklós Szilágyi and László Szilágyi was additionally supported by the Hungarian Academy of Sciences through the János Bolyai Fellowship program.