
Adapting Document Ranking to Users’ Preferences Using Click-Through Data

  • Min Zhao
  • Hang Li
  • Adwait Ratnaparkhi
  • Hsiao-Wuen Hon
  • Jue Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)

Abstract

This paper proposes a new approach to ranking the documents retrieved by a search engine using click-through data. The goal is to make the final ranked list of documents accurately reflect users' preferences as expressed in the click-through data. Our approach combines the ranking produced by a traditional IR algorithm (BM25) with that produced by a machine learning algorithm (Naïve Bayes). The machine learning algorithm is trained on click-through data (queries and their associated documents), while the IR algorithm runs over the document collection. We consider several alternative strategies for combining the results obtained from click-through data with those obtained from the document data. Experimental results confirm that every method of using click-through data greatly improves preference ranking over using BM25 alone, and that a linear combination of Naïve Bayes scores and BM25 scores performs best for the task. At the same time, we found that the preference ranking methods preserve relevance ranking, i.e., they perform as well as BM25 for relevance ranking.
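The combination strategy the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the rank_bm25 package for BM25 scoring, scikit-learn's MultinomialNB as a stand-in for the Naïve Bayes learner trained on (query, clicked document) pairs, toy data, min-max normalization of BM25 scores, and a hypothetical mixing weight alpha; the paper does not pin down these details.

```python
# A minimal sketch of the combination strategy described in the abstract,
# not the authors' implementation. Assumptions (not from the paper): the
# rank_bm25 package supplies BM25, scikit-learn's MultinomialNB stands in
# for the Naive Bayes learner, the click-through data is a toy list of
# (query, clicked-document) pairs, BM25 scores are min-max normalized,
# and alpha is a hypothetical mixing weight.
import numpy as np
from rank_bm25 import BM25Okapi
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = [
    "okapi bm25 probabilistic ranking function",
    "naive bayes text classification tutorial",
    "click through data for preference ranking",
]

# Click-through data: (query text, index of the clicked document).
clicks = [
    ("bm25 ranking", 0),
    ("bayes classifier", 1),
    ("click data ranking", 2),
    ("search click logs", 2),
]

# Naive Bayes trained on click-through data: queries are the inputs,
# clicked documents are the classes.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform([q for q, _ in clicks])
nb = MultinomialNB().fit(X, [d for _, d in clicks])

# BM25 runs over the document collection itself.
bm25 = BM25Okapi([d.split() for d in docs])

def combined_scores(query, alpha=0.5):
    """Linear combination of Naive Bayes and normalized BM25 scores."""
    raw = np.asarray(bm25.get_scores(query.split()), dtype=float)
    span = raw.max() - raw.min()
    bm25_norm = (raw - raw.min()) / span if span > 0 else raw
    # P(document | query) from the click-trained model, aligned to doc order.
    nb_scores = np.zeros(len(docs))
    nb_scores[nb.classes_] = nb.predict_proba(vectorizer.transform([query]))[0]
    return alpha * nb_scores + (1 - alpha) * bm25_norm

print(np.argsort(-combined_scores("click ranking")).tolist())  # doc ids, best first
```

With alpha = 0 this reduces to plain BM25 ranking, and with alpha = 1 to pure click-based preference ranking; the abstract's finding is that an intermediate linear mix of the two score sources performed best.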

Keywords

Search Engine · Relevance Feedback · Query Expansion · Preference Ranking · Document Retrieval



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Min Zhao (1)
  • Hang Li (2)
  • Adwait Ratnaparkhi (3)
  • Hsiao-Wuen Hon (2)
  • Jue Wang (1)

  1. Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. Microsoft Research Asia, Beijing, China
  3. Microsoft Corporation, Redmond, USA
