Abstract
The amount of information on the web is larger than ever and grows rapidly day by day. Finding the most relevant and useful result set in this volume of data involves searching, ranking, and presenting the results. Most users examine only the top few results and neglect the rest. To increase user satisfaction, the presented result set should therefore not only be relevant to the search topic but also cover a variety of perspectives; that is, the results should differ from one another. Presenting varied results for a search query, in order of relevance and concern, can improve both the effectiveness of web search and the satisfaction of users. The technique used to avoid presenting similar, though relevant, results to the user is known as search result diversification. This article surveys the approaches used for search result diversification: it provides a technical survey of existing diversification techniques and presents a taxonomy of diversification algorithms with respect to the types of search queries.
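The trade-off described in the abstract, relevance to the query versus difference among the returned results, can be illustrated with a small sketch. The code below uses greedy re-ranking in the style of maximal marginal relevance (MMR, Carbonell and Goldstein 1998), chosen here only as a familiar example; the abstract does not single out this technique. The function names, the Jaccard term-overlap similarity, the weight `lam`, and the toy documents for the ambiguous query "jaguar" are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Term-overlap similarity between two documents, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_rerank(relevance: Dict[str, float],
               terms: Dict[str, Set[str]],
               k: int,
               lam: float = 0.5) -> List[str]:
    """Greedily pick up to k documents, trading relevance against redundancy.

    lam = 1.0 ranks by relevance alone; smaller values penalise documents
    that resemble ones already selected.
    """
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            # Redundancy: highest similarity to any already selected document.
            redundancy = max(
                (jaccard(terms[doc], terms[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical relevance scores and term sets for the query "jaguar".
RELEVANCE = {"car1": 0.9, "car2": 0.85, "cat1": 0.6}
TERMS = {
    "car1": {"jaguar", "car", "engine"},
    "car2": {"jaguar", "car", "engine", "price"},
    "cat1": {"jaguar", "cat", "habitat"},
}

diversified = mmr_rerank(RELEVANCE, TERMS, k=2, lam=0.5)
relevance_only = mmr_rerank(RELEVANCE, TERMS, k=2, lam=1.0)
```

With these toy inputs, `relevance_only` is `["car1", "car2"]`, two near-duplicate pages about the car, while `diversified` is `["car1", "cat1"]`, which covers both interpretations of the query at the cost of a lower-scored second result.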
Acknowledgments
We are grateful to the reviewers of this article for their constructive and valuable review comments, which helped us to improve this article in many different ways.
Cite this article
Abid, A., Hussain, N., Abid, K. et al. A survey on search results diversification techniques. Neural Comput & Applic 27, 1207–1229 (2016). https://doi.org/10.1007/s00521-015-1945-5
DOI: https://doi.org/10.1007/s00521-015-1945-5