Aggregated Search Result Diversification

  • Rodrygo L. T. Santos
  • Craig Macdonald
  • Iadh Ounis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6931)

Abstract

Search result diversification has been effectively employed to tackle query ambiguity, particularly in the context of web search. However, ambiguity can manifest differently in different search verticals, with ambiguous queries spanning, e.g., multiple place names, content genres, or time periods. In this paper, we empirically investigate the need for diversity across four different verticals of a commercial search engine, including web, image, news, and product search. As a result, we introduce the problem of aggregated search result diversification as the task of satisfying multiple information needs across multiple search verticals. Moreover, we propose a probabilistic approach to tackle this problem, as a natural extension of state-of-the-art diversification approaches. Finally, we generalise standard diversity metrics, such as ERR-IA and α-nDCG, into a framework for evaluating diversity across multiple search verticals.
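
As an illustration of the kind of metric the abstract refers to, the following is a minimal sketch of the standard single-vertical α-nDCG, not the multi-vertical generalisation proposed in the paper. It assumes each document is represented as the set of query aspects it covers. The function names (`alpha_dcg`, `greedy_ideal`, `alpha_ndcg`), the greedy approximation of the ideal ranking, and the toy data are assumptions made for this example.

```python
import math


def aspect_gain(aspects, seen, alpha):
    # Each aspect contributes less the more often it has already been covered.
    return sum((1 - alpha) ** seen.get(a, 0) for a in aspects)


def alpha_dcg(ranking, alpha=0.5, depth=None):
    """ranking: list of sets, each holding the aspect ids a document covers."""
    seen = {}  # aspect id -> number of higher-ranked documents covering it
    score = 0.0
    for rank, aspects in enumerate(ranking[:depth], start=1):
        score += aspect_gain(aspects, seen, alpha) / math.log2(rank + 1)
        for a in aspects:
            seen[a] = seen.get(a, 0) + 1
    return score


def greedy_ideal(pool, alpha=0.5, depth=None):
    # Greedy approximation of the ideal ordering (assumption of this sketch):
    # repeatedly pick the document with the largest novelty-discounted gain.
    remaining = list(pool)
    limit = len(remaining) if depth is None else min(depth, len(remaining))
    seen, ideal = {}, []
    while len(ideal) < limit:
        best = max(remaining, key=lambda d: aspect_gain(d, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for a in best:
            seen[a] = seen.get(a, 0) + 1
    return ideal


def alpha_ndcg(ranking, pool, alpha=0.5, depth=None):
    ideal_score = alpha_dcg(greedy_ideal(pool, alpha, depth), alpha, depth)
    return alpha_dcg(ranking, alpha, depth) / ideal_score if ideal_score > 0 else 0.0


# Toy data (hypothetical): two documents cover aspect "a", one covers "b".
pool = [{"a"}, {"a"}, {"b"}]
redundant = [{"a"}, {"a"}, {"b"}]
diverse = [{"a"}, {"b"}, {"a"}]
print(alpha_ndcg(redundant, pool), alpha_ndcg(diverse, pool))
```

In this toy setting the ranking that covers a second aspect early scores higher than the one that repeats the first aspect, which is the behaviour a diversity metric is meant to reward.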

Keywords

Search Result; Product Search; Information Retrieval Evaluation; Commercial Search Engine; Query Ambiguity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rodrygo L. T. Santos (1)
  • Craig Macdonald (1)
  • Iadh Ounis (1)

  1. School of Computing Science, University of Glasgow, Glasgow, UK
