Unsupervised deep neuron-per-neuron hashing

  • Sanaa Chafik
  • Mounim A. El Yacoubi
  • Imane Daoudi
  • Hamid El Ouardi
Article

Abstract

Hashing has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval. A variety of hashing methods have been developed to learn an efficient binary data representation, mainly by relaxing some of the constraints imposed during hash function learning. Although these methods achieve a good accuracy-speed trade-off, the resulting binary codes sometimes fail to approximate the input data adequately, which significantly decreases search accuracy. In this paper, we present a new unsupervised deep learning hashing approach, called Deep Neuron-per-Neuron Hashing, for high-dimensional data indexing. Unlike most existing hashing approaches, our method does not seek to binarize the neural network output; it relies directly on the continuous output to create an efficient index structure with hash tables. Given the deepest layer of the neural network, each table separately indexes the output of one neuron, thereby capturing a particular high-level individual structure (feature) of the input. An efficient search is then performed by computing a cumulative collision score of a given query over all the neuron-based hash tables. Experimental comparisons with the state of the art demonstrate the competitiveness of the proposed method on large datasets.
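
The indexing and search idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a continuous neuron output is mapped to a bucket, so the uniform quantization with a fixed `width` used here is an assumption, as are the function names `build_tables` and `query`. The array `codes` stands in for the deepest-layer outputs of an already trained network, one row per indexed item.

```python
from collections import defaultdict

import numpy as np


def build_tables(codes, width):
    """Build one hash table per neuron (column of `codes`).

    Assumption: a neuron output is bucketed by uniform quantization,
    bucket = floor(value / width). Each table maps bucket id -> item indices.
    """
    buckets = np.floor(codes / width).astype(int)
    tables = []
    for j in range(codes.shape[1]):
        table = defaultdict(list)
        for i, b in enumerate(buckets[:, j]):
            table[int(b)].append(i)
        tables.append(table)
    return tables


def query(tables, q_code, width, n_items, top_k):
    """Rank items by cumulative collision score over all neuron tables."""
    scores = np.zeros(n_items, dtype=int)
    q_buckets = np.floor(q_code / width).astype(int)
    for table, b in zip(tables, q_buckets):
        hits = table.get(int(b), [])
        if hits:
            # Each item appears at most once per table, so this adds
            # exactly one collision per matching neuron.
            scores[hits] += 1
    # Stable sort keeps index order among items with equal scores.
    order = np.argsort(-scores, kind="stable")
    return order[:top_k], scores
```

With two neurons and three indexed items, a query that falls into the same buckets as items 0 and 1 on both neurons gives them a score of 2 each, while item 2 scores 0. In practice the bucket width trades recall against candidate-set size, and a real index would likely probe neighboring buckets as well; both points go beyond what the abstract states.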

Keywords

  • Information retrieval
  • Indexing
  • Approximate nearest neighbor search
  • Deep learning
  • Unsupervised hashing

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. SAMOVAR, Telecom SudParis, CNRS, University Paris Saclay, Evry Cedex, France
  2. LISER, ENSEM, Hassan II University Casablanca, Casablanca, Morocco