Spying Out Accurate User Preferences for Search Engine Adaptation

  • Conference paper
Advances in Web Mining and Web Usage Analysis (WebKDD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3932)

Abstract

Most existing search engines employ static ranking algorithms that do not adapt to the specific needs of users. Recently, some researchers have studied the use of clickthrough data to adapt a search engine’s ranking function. Clickthrough data indicate, for each query, the results that users clicked. As a form of implicit relevance feedback, clickthrough data can easily be collected by a search engine. However, clickthrough data are sparse and incomplete, so discovering accurate user preferences from them is a challenge. In this paper, we propose a novel algorithm called “Spy Naïve Bayes” (SpyNB) to identify user preferences from clickthrough data. First, we treat the result items clicked by users as sure positive examples and those not clicked as unlabelled data. Then, we plant the sure positive examples (the spies) into the unlabelled set of result items and apply naïve Bayes classification to identify reliable negative examples. These positive and negative examples allow us to discover users’ preferences more accurately. Finally, we combine the SpyNB algorithm with a ranking SVM optimizer to build an adaptive metasearch engine. Our experimental results show that, compared with the original ranking, SpyNB can significantly improve the average rank of users’ clicks by 20%.
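The spy step described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the feature representation, the thresholding rule, or how spies are selected, so several pieces here are assumptions. Each result item is assumed to be a list of tokens, the classifier is a hand-rolled multinomial naïve Bayes with Laplace smoothing, and an unclicked item is treated as a reliable negative when its positive probability falls below the lowest probability assigned to any spy. The names `train_nb`, `prob_positive`, and `spy_reliable_negatives` are hypothetical.

```python
import math
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Train a two-class multinomial naive Bayes model with Laplace smoothing."""
    vocab = {t for d in pos_docs + neg_docs for t in d}
    total = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(t for d in docs for t in d)
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "prior": math.log(len(docs) / total),
            "loglik": {t: math.log((counts[t] + 1) / denom) for t in vocab},
        }
    return model


def prob_positive(model, doc):
    """Posterior probability that doc is positive; unseen tokens are ignored."""
    scores = {
        label: p["prior"] + sum(p["loglik"][t] for t in doc if t in p["loglik"])
        for label, p in model.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {label: math.exp(s - top) for label, s in scores.items()}
    return exp["pos"] / (exp["pos"] + exp["neg"])


def spy_reliable_negatives(clicked, unclicked, spy_indices):
    """Plant some clicked items (spies) among the unclicked ones and return
    the unclicked items that score below every spy.

    Assumption: the threshold is the minimum spy probability; the abstract
    does not state the rule actually used by SpyNB.
    """
    spies = [clicked[i] for i in spy_indices]
    remaining = [d for i, d in enumerate(clicked) if i not in spy_indices]
    if not spies or not remaining:
        raise ValueError("need at least one spy and one non-spy positive")
    # Unlabelled items plus spies are provisionally treated as negatives.
    model = train_nb(remaining, unclicked + spies)
    threshold = min(prob_positive(model, s) for s in spies)
    return [d for d in unclicked if prob_positive(model, d) < threshold]


# Toy data (invented for illustration): token lists for result items.
clicked = [["python", "tutorial"], ["python", "guide"], ["python", "docs"]]
unclicked = [["python", "docs", "examples"], ["snake", "habitat"], ["snake", "species"]]
negatives = spy_reliable_negatives(clicked, unclicked, {0})
```

In this toy run the unclicked item that resembles the clicked ones stays unlabelled, while the two items about snakes are returned as reliable negatives. Pairs of clicked (positive) and reliable negative items could then serve as preference input to a ranking learner, which is the role the abstract assigns to the ranking SVM.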

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Deng, L., Ng, W., Chai, X., Lee, DL. (2006). Spying Out Accurate User Preferences for Search Engine Adaptation. In: Mobasher, B., Nasraoui, O., Liu, B., Masand, B. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2004. Lecture Notes in Computer Science, vol 3932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11899402_6

  • DOI: https://doi.org/10.1007/11899402_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-47127-1

  • Online ISBN: 978-3-540-47128-8

  • eBook Packages: Computer Science, Computer Science (R0)
