Abstract
Learning-based hashing methods are becoming the mainstream approach to large-scale visual search. They consist of two main components: learning hash codes for the training data and learning hash functions for encoding new data points. The performance of a content-based image retrieval system depends crucially on the feature representation, and Convolutional Neural Networks (CNNs) have proven effective at extracting high-level visual features for large-scale image retrieval. In this paper, we propose a Multiple Hierarchical Deep Hashing (MHDH) approach for large-scale image retrieval. MHDH integrates multiple hierarchical non-linear transformations with a hidden neural network layer for hash code generation. The learned binary codes represent latent concepts that connect to class labels. Extensive experiments on two popular datasets demonstrate the superiority of MHDH over both supervised and unsupervised hashing methods.
Notes
During training, we treat the ‘0’ bit as ‘-1’; when encoding and testing, we use ‘0’ again.
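The bit convention in this note can be illustrated with a minimal Python sketch. It is not taken from the MHDH implementation: the helper names (`to_signed`, `to_binary`, `hamming`, `hamming_from_signed`) and the example codes are hypothetical. The sketch maps between the {0, 1} form and the {-1, +1} form, and checks the standard identity that, for K-bit signed codes, the Hamming distance equals (K - inner product) / 2.

```python
def to_signed(bits):
    """Map a {0, 1} code to the {-1, +1} form (hypothetical helper)."""
    return [2 * b - 1 for b in bits]


def to_binary(signed):
    """Map a {-1, +1} code back to the {0, 1} form (hypothetical helper)."""
    return [(s + 1) // 2 for s in signed]


def hamming(a, b):
    """Count the positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def hamming_from_signed(a, b):
    """Hamming distance of two {-1, +1} codes via their inner product."""
    k = len(a)
    dot = sum(x * y for x, y in zip(a, b))
    return (k - dot) // 2


# Illustrative 4-bit codes, not data from the paper.
code_a = [1, 0, 1, 1]
code_b = [0, 0, 1, 0]

signed_a = to_signed(code_a)
signed_b = to_signed(code_b)

# Both routes give the same distance, so switching conventions
# between training and testing does not change the ranking.
print(hamming(code_a, code_b))
print(hamming_from_signed(signed_a, signed_b))
```

Because the two forms are related by an invertible affine map, a code learned in the signed form can be stored and compared in the {0, 1} form without loss.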
Acknowledgments
This work is supported by the Fundamental Research Funds for the Central Universities (Grant no. ZYGX2014J063), the National Natural Science Foundation of China (Grant no. 61502080), the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology.
Cite this article
Cao, L., Gao, L., Song, J. et al. Multiple hierarchical deep hashing for large scale image retrieval. Multimed Tools Appl 77, 10471–10484 (2018). https://doi.org/10.1007/s11042-017-4489-0
DOI: https://doi.org/10.1007/s11042-017-4489-0