
A survey on search results diversification techniques

  • Review
  • Neural Computing and Applications

Abstract

The quantity of information on the web is greater than ever before and continues to grow rapidly. Finding the most relevant and useful result set in this huge amount of data involves searching, ranking, and presenting the results. Most users examine only the top few results and neglect the rest. To increase user satisfaction, the presented result set should not only be relevant to the search topic but should also offer a variety of perspectives; that is, the results should differ from one another. The effectiveness of web search and the satisfaction of users can be enhanced by providing varied results for a search query, ordered by relevance and concern. The technique used to avoid presenting similar, though relevant, results to the user is known as search result diversification. This article presents a technical survey of existing diversification techniques, together with a taxonomy of diversification algorithms with respect to the types of search queries.
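To make the idea of trading relevance against redundancy concrete, the following is a minimal sketch of one well-known diversification heuristic, greedy Maximal Marginal Relevance (MMR) re-ranking. It is an illustration only and is not claimed to be the method proposed or favoured by this survey. The function names (`mmr_rerank`, `jaccard`), the toy documents, their relevance scores, and the choice of Jaccard similarity over term sets are all assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(relevance, similarity, k, lam=0.5):
    """Greedy MMR: repeatedly pick the candidate that maximises
    lam * relevance - (1 - lam) * (max similarity to already selected results).

    relevance:  dict mapping doc id -> relevance score for the query
    similarity: function (doc_id_a, doc_id_b) -> similarity in [0, 1]
    k:          number of results to return
    lam:        trade-off; 1.0 means pure relevance, lower values favour novelty
    """
    candidates = set(relevance)
    selected = []
    while candidates and len(selected) < k:
        def score(doc):
            # Redundancy is the closest match among results chosen so far.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1 - lam) * redundancy

        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data for the ambiguous query "jaguar".
docs = {
    "a": {"jaguar", "car", "speed"},
    "b": {"jaguar", "car", "price"},
    "c": {"jaguar", "cat", "habitat"},
}
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}


def doc_similarity(x, y):
    return jaccard(docs[x], docs[y])


diverse = mmr_rerank(relevance, doc_similarity, k=2, lam=0.5)
by_relevance = mmr_rerank(relevance, doc_similarity, k=2, lam=1.0)
print(diverse, by_relevance)
```

With `lam=0.5` the second slot goes to the less relevant but dissimilar document `c` (the animal sense), whereas `lam=1.0` reduces to plain relevance ranking and returns the two near-duplicate car documents.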





Acknowledgments

We are grateful to the reviewers of this article for their constructive and valuable review comments, which helped us to improve this article in many different ways.

Author information


Corresponding author

Correspondence to Yaser Daanial Khan.


About this article


Cite this article

Abid, A., Hussain, N., Abid, K. et al. A survey on search results diversification techniques. Neural Comput & Applic 27, 1207–1229 (2016). https://doi.org/10.1007/s00521-015-1945-5


  • DOI: https://doi.org/10.1007/s00521-015-1945-5
