Leveraging Dynamic Query Subtopics for Time-Aware Search Result Diversification

  • Tu Ngoc Nguyen
  • Nattiya Kanhabua
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8416)

Abstract

Search result diversification is a common technique for tackling ambiguous and multi-faceted queries by maximizing the coverage of query aspects, or subtopics, in a result list. In some cases, the subtopics associated with such queries are temporally ambiguous: for instance, the query US Open is more likely to target the tennis tournament in September and the golf tournament in June. More precisely, users’ search intent can be identified by the popularity of a subtopic at the time the query is issued. In this paper, we study search result diversification for time-sensitive queries, where the temporal dynamics of query subtopics are explicitly determined and modeled into result diversification. Unlike previous work, which in general considered only static subtopics, we leverage dynamic subtopics by analyzing two data sources (i.e., query logs and a document collection). These data sources provide insights, from different perspectives, into how query subtopics change over time. Moreover, we propose novel time-aware diversification methods that leverage the identified dynamic subtopics. A key idea is to re-rank search results based on the freshness and popularity of subtopics. Our experimental results show that the proposed methods can significantly improve diversity and relevance effectiveness for time-sensitive queries in comparison with state-of-the-art methods.
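The re-ranking idea can be illustrated with a minimal sketch. It is not the method proposed in the paper: it assumes a generic greedy, xQuAD-style selection in which each subtopic carries a weight reflecting its popularity at query time. The function name `time_aware_rerank`, the trade-off parameter `lam`, and all document names, relevance scores, coverage values, and subtopic weights are hypothetical and chosen only to mirror the US Open example.

```python
from typing import Dict, List


def time_aware_rerank(
    relevance: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
    subtopic_weight: Dict[str, float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily pick k documents, trading relevance against coverage of
    subtopics weighted by their (time-dependent) popularity.

    relevance[d]        -- relevance score of document d for the query
    coverage[d][s]      -- degree to which document d covers subtopic s
    subtopic_weight[s]  -- popularity of subtopic s at query time (assumed given)
    lam                 -- weight of the diversity term (hypothetical default)
    """
    total = sum(subtopic_weight.values())
    weights = {s: w / total for s, w in subtopic_weight.items()}

    # not_covered[s]: how much of subtopic s is still unsatisfied by the picks so far
    not_covered = {s: 1.0 for s in weights}
    selected: List[str] = []
    candidates = set(relevance)

    def score(doc: str) -> float:
        diversity = sum(
            weights[s] * coverage[doc].get(s, 0.0) * not_covered[s]
            for s in weights
        )
        return (1.0 - lam) * relevance[doc] + lam * diversity

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        for s in weights:
            not_covered[s] *= 1.0 - coverage[best].get(s, 0.0)
    return selected


# Hypothetical candidate documents for the query "US Open".
relevance = {
    "golf_news": 0.70,
    "tennis_news": 0.65,
    "golf_tickets": 0.60,
    "tennis_draw": 0.55,
}
coverage = {
    "golf_news": {"golf": 1.0},
    "tennis_news": {"tennis": 1.0},
    "golf_tickets": {"golf": 1.0},
    "tennis_draw": {"tennis": 1.0},
}

# Assumed subtopic popularity at two different query times.
september = time_aware_rerank(relevance, coverage, {"tennis": 0.9, "golf": 0.1}, k=2)
june = time_aware_rerank(relevance, coverage, {"tennis": 0.1, "golf": 0.9}, k=2)
```

With these assumed weights, the same candidate set is ordered differently depending on when the query is issued: the subtopic that is popular at query time is served first, and the other subtopic still appears in the top results.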

Keywords

Latent Dirichlet Allocation; Latent Dirichlet Allocation Model; Random Walk With Restart; Search Intent; Latent Dirichlet Allocation Topic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Tu Ngoc Nguyen (1)
  • Nattiya Kanhabua (1)
  1. L3S Research Center, Leibniz Universität Hannover, Hannover, Germany
