Search Result Diversification Based on Query Facets

  • Regular Paper
Journal of Computer Science and Technology

Abstract

In search engines, different users may search for different information by issuing the same query. To satisfy more users with limited search results, search result diversification re-ranks the results to cover as many user intents as possible. Most existing intent-aware diversification algorithms recognize user intents as subtopics, each of which is usually a word, a phrase, or a piece of description. In this paper, we leverage query facets to understand user intents in diversification, where each facet contains a group of words or phrases that explain an underlying intent of a query. We generate subtopics based on query facets and propose faceted diversification approaches. Experimental results on the public TREC 2009 dataset show that our faceted approaches outperform state-of-the-art diversification models.

Author information

Corresponding author

Correspondence to Zhi-Cheng Dou.

Additional information

This work was partially supported by the National Basic Research 973 Program of China under Grant No. 2014CB340403, the Fundamental Research Funds for the Central Universities of China, and the Research Funds of Renmin University of China under Grant No. 15XNLF03.

Cite this article

Hu, S., Dou, ZC., Wang, XJ. et al. Search Result Diversification Based on Query Facets. J. Comput. Sci. Technol. 30, 888–901 (2015). https://doi.org/10.1007/s11390-015-1567-5

  • DOI: https://doi.org/10.1007/s11390-015-1567-5
