Unsupervised query-adaptive implicit subtopic discovery for diverse image retrieval based on intrinsic cluster quality

Figuerêdo, José Solenir Lima; Calumby, Rodrigo Tripodi

doi:10.1007/s11042-022-13050-4

1135T: Social Multimedia Processing
Published: 04 May 2022

Volume 81, pages 42991–43011, (2022)

Multimedia Tools and Applications

Abstract

Given the complex search tasks imposed on social multimedia retrieval systems, similarity-based ranked results often contain redundant items, such as near-duplicates or unrepresentative samples. Several real-world search tasks demand broad coverage of the multiple implicit subtopics of a query in order to properly fulfill the user's need. Many works have proposed result diversification to address this problem. In a popular approach, diversification is achieved by grouping similar items obtained from the original ranked list; a new, diverse ranked list is then constructed by iteratively selecting a representative item from each cluster. However, defining the number of clusters (subtopics) to be discovered is a long-standing challenge. Moreover, most clustering optimization approaches for diversification rely on offline training to select a single best general configuration, which is then used for all queries at run time. This is a complex task given the many sources of heterogeneity involved (data, user, query, concepts, etc.) and their impact on the effectiveness of retrieval algorithms, so such approaches are usually prone to overfitting. To attenuate these problems, this work proposes a novel diverse image retrieval approach that performs unsupervised, query-adaptive subtopic discovery based on intrinsic clustering quality optimization. Our experimental analysis shows significant improvements over the baseline in terms of both relevance and diversity.
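
The general idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or intrinsic quality index the authors use, so everything concrete here is an assumption made for illustration: k-means with a deterministic farthest-first initialization, the silhouette coefficient as the quality index, and the hypothetical function names `kmeans`, `silhouette`, and `diversify`. For each query, the sketch tries several cluster counts, keeps the clustering with the best intrinsic quality, and then builds the new ranking by taking items from each cluster in turn.

```python
import math


def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means better-separated clusters."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(p, points[j]) for j in idx) / len(idx)
            for l, idx in clusters.items() if l != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def diversify(ranked, k_range=range(2, 6)):
    """ranked: feature tuples in relevance order. Returns reordered indices."""
    # Query-adaptive step: pick the cluster count with the best intrinsic quality.
    best_labels = max(
        (kmeans(ranked, k) for k in k_range if k < len(ranked)),
        key=lambda labels: silhouette(ranked, labels),
    )
    # Members of each cluster stay in their original rank order.
    queues = {}
    for i, l in enumerate(best_labels):
        queues.setdefault(l, []).append(i)
    # Visit clusters in order of their best-ranked member, one item per turn.
    order = sorted(queues.values(), key=lambda q: q[0])
    result = []
    while any(order):
        for q in order:
            if q:
                result.append(q.pop(0))
    return result
```

Calling `diversify` on a list of feature tuples for a single query returns the item indices in the new order, so items from different discovered clusters are interleaved at the top of the list. Because the cluster count is chosen separately for each query, no offline training of a global configuration is involved in this sketch.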

Author information

Authors and Affiliations

Department of Exact Sciences, University of Feira de Santana, Feira de Santana, Bahia, Brazil
José Solenir Lima Figuerêdo & Rodrigo Tripodi Calumby

Corresponding author

Correspondence to José Solenir Lima Figuerêdo.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Figuerêdo, J.S.L., Calumby, R.T. Unsupervised query-adaptive implicit subtopic discovery for diverse image retrieval based on intrinsic cluster quality. Multimed Tools Appl 81, 42991–43011 (2022). https://doi.org/10.1007/s11042-022-13050-4

Received: 01 May 2020
Revised: 09 March 2022
Accepted: 04 April 2022
Published: 04 May 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11042-022-13050-4
