Robust zero-shot discrete hashing with noisy labels for cross-modal retrieval

Yong, Kailing; Shu, Zhenqiu; Wang, Hongbin; Yu, Zhengtao

doi:10.1007/s13042-024-02131-5

Original Article
Published: 13 April 2024

International Journal of Machine Learning and Cybernetics

Abstract

Recently, zero-shot hashing methods have been successfully applied to cross-modal retrieval. However, these methods typically assume that the training labels are accurate and noise-free, which is unrealistic in real-world scenarios because manual or automatic annotation introduces noise. To address this problem, we propose a robust zero-shot discrete hashing with noisy labels (RZSDH) method, which explicitly accounts for the impact of noisy labels in real scenes. RZSDH imposes a sparse constraint on the noise matrix and a low-rank constraint on the recovered label matrix to reduce the negative impact of noisy labels. This significantly enhances the robustness of the proposed method in practical cross-modal retrieval tasks. Additionally, RZSDH learns a representation vector for each category attribute, which effectively captures the relationship between seen and unseen classes. Furthermore, our approach learns a common latent representation with drift from multimodal data features, which is more conducive to obtaining stable hash codes and hash functions. Finally, we employ a fine-grained similarity-preserving strategy to generate more discriminative hash codes. Experiments on several benchmark datasets verify the effectiveness and robustness of the proposed RZSDH method.
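
The abstract does not state the RZSDH objective, so the following is only a generic illustration of the idea of pairing a low-rank recovered label matrix with a sparse noise matrix. It is a minimal sketch, assuming the common robust-PCA-style formulation min ||L||_* + lam * ||E||_1 subject to Y = L + E, solved with a plain augmented Lagrangian loop. The function names, the default values of lam and mu, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_threshold(M, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def singular_value_threshold(M, tau):
    """Proximal operator of tau * ||.||_* (shrinks singular values)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def split_low_rank_sparse(Y, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split Y into a low-rank part L and a sparse part E with Y ~ L + E.

    Assumed objective (not the paper's): ||L||_* + lam * ||E||_1, Y = L + E.
    """
    Y = np.asarray(Y, dtype=float)
    n, c = Y.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n, c))       # a common default weight
    if mu is None:
        mu = 0.25 * n * c / np.abs(Y).sum()  # heuristic penalty parameter
    L = np.zeros_like(Y)
    E = np.zeros_like(Y)
    Z = np.zeros_like(Y)                     # Lagrange multiplier
    for _ in range(n_iter):
        L = singular_value_threshold(Y - E + Z / mu, 1.0 / mu)
        E = soft_threshold(Y - L + Z / mu, lam / mu)
        R = Y - L - E                        # constraint residual
        Z = Z + mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(Y):
            break
    return L, E


if __name__ == "__main__":
    # Synthetic multi-label matrix: 3 label patterns shared by 60 samples,
    # so the clean matrix has rank at most 3; about 5% of entries are flipped.
    rng = np.random.default_rng(0)
    prototypes = rng.integers(0, 2, size=(3, 12)).astype(float)
    clean = prototypes[rng.integers(0, 3, size=60)]
    flips = rng.random(clean.shape) < 0.05
    noisy = np.where(flips, 1.0 - clean, clean)
    L, E = split_low_rank_sparse(noisy)
    print("relative residual:", np.linalg.norm(noisy - L - E) / np.linalg.norm(noisy))
    print("entries flagged as noise:", int((np.abs(E) > 0.5).sum()))
```

In this sketch the low-rank term plays the role of the recovered label matrix and the sparse term absorbs isolated label flips. How RZSDH couples these terms with hash-code learning is described in the full article, not here.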

Data availability

All data for this study are available from public repositories.

Acknowledgements

This work was supported by the National Natural Science Foundation of China [Grant nos. 61603159, 62162033, U21B2027], Yunnan Provincial Major Science and Technology Special Plan Projects [Grant nos. 202002AD080001, 202103AA080015], and Yunnan Foundation Research Projects [Grant nos. 202101AT070438, 202101BE070001-056].

Author information

Authors and Affiliations

Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China
Kailing Yong, Zhenqiu Shu, Hongbin Wang & Zhengtao Yu

Corresponding author

Correspondence to Zhenqiu Shu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yong, K., Shu, Z., Wang, H. et al. Robust zero-shot discrete hashing with noisy labels for cross-modal retrieval. Int. J. Mach. Learn. & Cyber. (2024). https://doi.org/10.1007/s13042-024-02131-5

Received: 24 October 2023
Accepted: 10 March 2024
Published: 13 April 2024
DOI: https://doi.org/10.1007/s13042-024-02131-5
