Information Retrieval, Volume 15, Issue 5, pp 478–502

On the role of novelty for search result diversification

  • Rodrygo L. T. Santos
  • Craig Macdonald
  • Iadh Ounis

Abstract

Re-ranking the search results in order to promote novel ones has traditionally been regarded as an intuitive diversification strategy. In this paper, we challenge this common intuition and thoroughly investigate the actual role of novelty for search result diversification, based upon the framework provided by the diversity task of the TREC 2009 and 2010 Web tracks. Our results show that existing diversification approaches based solely on novelty cannot consistently improve over a standard, non-diversified baseline ranking. Moreover, when deployed as an additional component by the current state-of-the-art diversification approaches, our results show that novelty does not bring significant improvements, while adding considerable efficiency overheads. Finally, through a comprehensive analysis with simulated rankings of various quality, we demonstrate that, although inherently limited by the performance of the initial ranking, novelty plays a role in breaking ties between similarly diverse results.

Keywords

Web search, Relevance, Diversity

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Rodrygo L. T. Santos (1), email author
  • Craig Macdonald (1)
  • Iadh Ounis (1)
  1. School of Computing Science, University of Glasgow, Glasgow, UK
