A fast weighted multi-view Bayesian learning scheme with deep learning for text-based image retrieval from unlabeled galleries

Oussama, Aiadi; Khaldi, Belal; Kherfi, Mohammed Lamine

doi:10.1007/s11042-022-13788-x

Published: 17 September 2022

Volume 82, pages 10795–10812, (2023)

Multimedia Tools and Applications

Abstract

In this paper, we propose a computationally fast method for text-based image retrieval from unlabeled galleries, in which retrieval is formulated as a multi-class learning problem. Whereas most existing methods give equal importance to all images representing the same concept during learning, we propose a weighted multi-view likelihood term that handles the intra-class variation within each concept's training set. First, we cluster each training set to detect the concept's visual appearances (views). Because the number of clusters may vary significantly from one set to another, imposing a single value of this hyper-parameter on all sets could degrade the learning outcome. We therefore determine it automatically and precisely for each set using the Davies-Bouldin index. Images are represented by deep features, which are normalized with the vanilla L₂ rule to deal with bursty visual features. The proposed multi-view term is constructed by combining the multivariate normal probability density functions associated with the resulting clusters. This term is then incorporated into a naïve Bayes classifier alongside the prior probability of the concept, with each component weighted using the Expectation-Maximization (EM) algorithm. Given a textual query, the relevant images are those that reach the highest posterior probability scores, computed with our Bayesian learning scheme. Experimental results on public datasets demonstrate the effectiveness and speed of the proposed method compared with several other methods.
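
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: 2-D toy vectors stand in for deep features, plain k-means is used for clustering, each view is modelled by a diagonal-covariance Gaussian with a variance floor, and the view weights are simple cluster proportions rather than EM estimates. All function names, the toy data, and the prior value are hypothetical.

```python
import math
import random

def l2_normalize(v):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns non-empty clusters."""
    centers = random.Random(seed).sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]

def davies_bouldin(clusters):
    """Davies-Bouldin index: lower values mean compact, well-separated clusters."""
    k = len(clusters)
    if k < 2:
        return float("inf")  # the index is undefined for a single cluster
    centers = [mean(c) for c in clusters]
    scatter = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centers)]
    worst = [max((scatter[i] + scatter[j]) / max(dist(centers[i], centers[j]), 1e-12)
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k

def choose_views(points, k_max=4):
    """Try k = 2..k_max and keep the clustering with the lowest index."""
    return min((kmeans(points, k) for k in range(2, k_max + 1)), key=davies_bouldin)

def fit_views(clusters, var_floor=1e-3):
    """One (weight, mean, variance) triple per view; weights are cluster proportions."""
    total = sum(len(c) for c in clusters)
    views = []
    for c in clusters:
        m = mean(c)
        var = [max(sum((p[d] - m[d]) ** 2 for p in c) / len(c), var_floor)
               for d in range(len(m))]
        views.append((len(c) / total, m, var))
    return views

def log_likelihood(x, views):
    """log sum_v w_v * N(x; mean_v, diag(var_v)), computed with log-sum-exp."""
    terms = []
    for w, m, var in views:
        logpdf = -0.5 * sum(math.log(2 * math.pi * s) + (xi - mi) ** 2 / s
                            for xi, mi, s in zip(x, m, var))
        terms.append(math.log(w) + logpdf)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Toy training set for one concept with two visual appearances (views).
raw_a = [[1.0, y] for y in (-0.10, -0.05, 0.0, 0.05, 0.10)]
raw_b = [[x, 1.0] for x in (-0.10, -0.05, 0.0, 0.05, 0.10)]
train = [l2_normalize(v) for v in raw_a + raw_b]

clusters = choose_views(train, k_max=3)
views = fit_views(clusters)

prior = 0.5  # hypothetical prior probability of the concept
gallery = {"near_view_b": l2_normalize([0.02, 1.0]),
           "off_concept": l2_normalize([1.0, 1.0])}
scores = {name: math.log(prior) + log_likelihood(x, views)
          for name, x in gallery.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(len(clusters), ranking)
```

Scores are kept in the log domain so that the product of prior and likelihood does not underflow for high-dimensional features. Because the Davies-Bouldin index is undefined for a single cluster, this sketch assumes every concept has at least two views.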

Data availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Author information

Authors and Affiliations

Department of Computer Science, Artificial Intelligence and Information Technology Laboratory (LINATI), University of Kasdi Merbah, 30000, Ouargla, Algeria
Aiadi Oussama & Belal Khaldi
Laboratoire de recherche en Mathématiques et Informatique Appliquées (LAMIA), 3351, Boulevard des Forges, C.P. 500, Trois-Rivières, G9A 5H7, Canada
Mohammed Lamine Kherfi

Corresponding author

Correspondence to Aiadi Oussama.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Oussama, A., Khaldi, B. & Kherfi, M.L. A fast weighted multi-view Bayesian learning scheme with deep learning for text-based image retrieval from unlabeled galleries. Multimed Tools Appl 82, 10795–10812 (2023). https://doi.org/10.1007/s11042-022-13788-x

Received: 11 September 2021
Revised: 15 July 2022
Accepted: 05 September 2022
Published: 17 September 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s11042-022-13788-x
