Spying Out Accurate User Preferences for Search Engine Adaptation

Deng, Lin; Ng, Wilfred; Chai, Xiaoyong; Lee, Dik-Lun

doi:10.1007/11899402_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3932)

Included in the following conference series:

International Workshop on Knowledge Discovery on the Web

Abstract

Most existing search engines employ static ranking algorithms that do not adapt to the specific needs of users. Recently, some researchers have studied the use of clickthrough data to adapt a search engine’s ranking function. Clickthrough data indicate, for each query, the results that users clicked. As a kind of implicit relevance feedback, clickthrough data can easily be collected by a search engine. However, clickthrough data are sparse and incomplete, so it is a challenge to discover accurate user preferences from them. In this paper, we propose a novel algorithm called “Spy Naïve Bayes” (SpyNB) to identify user preferences from clickthrough data. First, we treat the result items clicked by users as sure positive examples and those not clicked as unlabelled data. Then, we plant the sure positive examples (the spies) into the unlabelled set of result items and apply naïve Bayes classification to identify reliable negative examples. These positive and negative examples allow us to discover users’ preferences more accurately. Finally, we combine the SpyNB algorithm with a ranking SVM optimizer to build an adaptive metasearch engine. Our experimental results show that, compared with the original ranking, SpyNB significantly improves the average rank of users’ clicks, by 20%.
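
The spy step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how many spies are planted, how the decision threshold is set, or how result items are represented. The sketch assumes that result items are lists of tokens, that half of the clicked items act as spies, and that an unclicked item counts as a reliable negative when its positive-class probability falls below that of the weakest spy. The names `train_nb`, `prob_pos`, `spy_reliable_negatives` and `spy_ratio` are hypothetical.

```python
import math
import random
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Train a two-class multinomial naive Bayes model with Laplace smoothing.

    Documents are lists of tokens. Both classes must be non-empty.
    """
    vocab = {t for d in pos_docs + neg_docs for t in d}
    total = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(t for d in docs for t in d)
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "log_prior": math.log(len(docs) / total),
            "log_lik": {t: math.log((counts[t] + 1) / denom) for t in vocab},
        }
    return model


def prob_pos(model, doc):
    """Posterior probability that doc belongs to the positive class."""
    scores = {}
    for label, m in model.items():
        # Tokens outside the training vocabulary are ignored.
        scores[label] = m["log_prior"] + sum(
            m["log_lik"][t] for t in doc if t in m["log_lik"]
        )
    top = max(scores.values())
    exp = {k: math.exp(v - top) for k, v in scores.items()}
    return exp["pos"] / (exp["pos"] + exp["neg"])


def spy_reliable_negatives(clicked, unclicked, spy_ratio=0.5, seed=0):
    """Return the unclicked items judged to be reliable negatives.

    Assumption (not from the paper): spy_ratio of the clicked items are
    planted as spies, and the threshold is the lowest spy probability.
    """
    rng = random.Random(seed)
    n_spies = max(1, int(len(clicked) * spy_ratio))
    if n_spies >= len(clicked):
        raise ValueError("need at least two clicked items to plant a spy")
    spy_idx = set(rng.sample(range(len(clicked)), n_spies))
    spies = [d for i, d in enumerate(clicked) if i in spy_idx]
    remaining = [d for i, d in enumerate(clicked) if i not in spy_idx]
    # Unlabelled items plus planted spies are provisionally treated as negative.
    model = train_nb(remaining, unclicked + spies)
    # Spies are known positives, so anything scoring below every spy
    # is unlikely to be a hidden positive.
    threshold = min(prob_pos(model, s) for s in spies)
    return [d for d in unclicked if prob_pos(model, d) < threshold]
```

Calling `spy_reliable_negatives(clicked, unclicked)` with token lists for one query returns the subset of `unclicked` that the sketch would pass on as negative examples; unclicked items that score as high as a spy stay unlabelled rather than being counted against the user.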

Author information

Authors and Affiliations

Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong
Lin Deng, Wilfred Ng, Xiaoyong Chai & Dik-Lun Lee

Editor information

Editors and Affiliations

Center for Web Intelligence School of Computing, DePaul University, Chicago, Illinois, USA
Bamshad Mobasher
Speed School of Engineering, Department of Computer Engineering & Computer Science, University of Louisville, KY 40292, Louisville, USA
Olfa Nasraoui
College of Architecture and Urban Planning, Tongji University, 1239 Siping Road, 200092, Shanghai, P.R. China
Bing Liu
Data Miners Inc., 77 North Washington Street, MA 02114, Boston, USA
Brij Masand

About this paper

Cite this paper

Deng, L., Ng, W., Chai, X., Lee, DL. (2006). Spying Out Accurate User Preferences for Search Engine Adaptation. In: Mobasher, B., Nasraoui, O., Liu, B., Masand, B. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2004. Lecture Notes in Computer Science, vol 3932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11899402_6

DOI: https://doi.org/10.1007/11899402_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47127-1
Online ISBN: 978-3-540-47128-8
eBook Packages: Computer Science, Computer Science (R0)
