Abstract
Hashing plays an important role in information retrieval because of its low storage cost and high processing speed. Among multi-modal representation learning methods, multi-modal hashing has received particular attention. Most existing multi-modal hashing methods use fixed weighting factors to fuse multiple modalities for every query, and therefore cannot capture the variation among different queries. In addition, their models contain too many hyper-parameters, and determining proper values for them is time-consuming and labor-intensive. These limitations may significantly hinder their adoption in practical applications. In this paper, we propose a simple yet effective method inspired by the Hadamard matrix. On the one hand, the proposed method is flexible and involves very few hyper-parameters. On the other hand, both the complementary information across modalities and the semantic discriminative information are well preserved in the hash codes. Extensive experimental results on four benchmark datasets show that the proposed framework is effective and achieves superior performance compared with state-of-the-art methods.
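The abstract does not say how the Hadamard matrix enters the method, so the following is only a minimal sketch of the matrix properties that make it attractive for hashing, not the authors' algorithm. It builds a Hadamard matrix with Sylvester's construction and checks that its rows are mutually orthogonal, which means any two distinct rows differ in exactly half of their bits. The function names `sylvester_hadamard` and `hamming` and the code length of 16 are illustrative assumptions.

```python
import numpy as np


def sylvester_hadamard(n: int) -> np.ndarray:
    """Build an n x n Hadamard matrix by Sylvester's construction.

    Assumption for this sketch: n is a positive power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    H = np.array([[1]], dtype=np.int8)
    while H.shape[0] < n:
        # Doubling step: [[H, H], [H, -H]] is again a Hadamard matrix.
        H = np.block([[H, H], [H, -H]])
    return H


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of positions where two {-1, +1} codes differ."""
    return int(np.sum(a != b))


n = 16  # illustrative code length
H = sylvester_hadamard(n)

# Rows are mutually orthogonal: H @ H.T equals n times the identity.
gram = H.astype(np.int32) @ H.T.astype(np.int32)

# Consequently, any two distinct rows are n / 2 bits apart.
dist = hamming(H[0], H[5])
```

Because every pair of distinct rows is equally and maximally separated in this sense, rows of such a matrix are a natural choice of well-spread binary target codes, for example one row per semantic class.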
Acknowledgements
The authors would like to thank the anonymous reviewers for their encouragement and helpful comments. This work is supported by the Research Startup Fund Project of Zhengzhou University of Light Industry (Grant No. 2021BSJJ025), the Henan Provincial Department of Science and Technology Research Project (Grant No. 222102210064), and the National Natural Science Foundation of China (Grant No. 62162033).
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Yu, J., Zhang, D., Shu, Z. et al. Adaptive multi-modal fusion hashing via Hadamard matrix. Appl Intell 52, 17170–17184 (2022). https://doi.org/10.1007/s10489-022-03367-w
DOI: https://doi.org/10.1007/s10489-022-03367-w