A Probabilistic Model for Information Retrieval by Mining User Behaviors

Cognitive Computation

Abstract

A query submitted to a search engine provides limited information about the searcher’s real search intent. Other information from the user’s historical behaviors, e.g., clicks and dwell time, can provide a strong clue to the search purpose. In this paper, we (1) investigate the impact of the distributions of users and queries on reranking documents that are initially returned by a search engine, and (2) perform tests for all users, i.e., users both seen and unseen in the training period. For unseen users, we explore knowledge from similar users seen in the training period. Our experiments show that integrating information from user behavior and documents with an optimal weight outperforms combinations with a fixed tradeoff. On average, our model achieves an improvement of nearly 3 % over the best baseline approach in terms of metrics such as MAP, P@K and NDCG@K.
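
To make the idea of a tuned versus fixed tradeoff concrete, the following is a minimal sketch, not the model from the paper (which this page does not show). It assumes a plain linear interpolation between a document score and a behavior score, and picks the weight by grid search on average precision. The function names, the toy scores, and the grid of weights are all hypothetical.

```python
import math


def rerank(doc_scores, behavior_scores, lam):
    """Order documents by lam * behavior score + (1 - lam) * document score."""
    combined = {
        d: lam * behavior_scores.get(d, 0.0) + (1.0 - lam) * s
        for d, s in doc_scores.items()
    }
    # Sort by descending combined score; break ties by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k documents that are relevant (P@K)."""
    return sum(1 for d in ranking[:k] if d in relevant) / k


def average_precision(ranking, relevant):
    """Average precision for one query; MAP is its mean over queries."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranking, gains, k):
    """NDCG@K with gain / log2(rank + 1) discounting."""
    dcg = sum(gains.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranking[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def tune_weight(doc_scores, behavior_scores, relevant, steps=10):
    """Grid-search the weight that maximizes average precision (assumed criterion)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(
        grid,
        key=lambda lam: average_precision(
            rerank(doc_scores, behavior_scores, lam), relevant
        ),
    )


# Toy data (made up): the engine prefers d1, past behavior prefers d3.
doc_scores = {"d1": 0.9, "d2": 0.6, "d3": 0.3}
behavior_scores = {"d1": 0.1, "d2": 0.2, "d3": 0.9}
relevant = {"d3"}

fixed_ranking = rerank(doc_scores, behavior_scores, lam=0.2)
tuned_lam = tune_weight(doc_scores, behavior_scores, relevant)
tuned_ranking = rerank(doc_scores, behavior_scores, tuned_lam)
```

In a real evaluation the weight would be chosen on training queries and the metrics reported on held-out queries; the sketch tunes and scores on the same toy query only to keep it short.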

Notes

  1. It’s interchangeable with document, represented by d.

  2. https://www.kaggle.com/c/yandex-personalized-web-search-challenge.

  3. http://trec.nist.gov/trec_eval/.

Acknowledgments

This study was partially supported by the Innovation Foundation of NUDT for Postgraduate under No. B130503.

Author information

Corresponding author

Correspondence to Fei Cai.

Ethics declarations

Conflicts of Interest

Fei Cai and Honghui Chen declare that they have no conflict of interest.

Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008 (5). Additional informed consent was obtained from all patients for which identifying information is included in this article.

Human and Animal Rights

This article does not contain any studies with human participants performed by any of the authors.

About this article

Cite this article

Cai, F., Chen, H. A Probabilistic Model for Information Retrieval by Mining User Behaviors. Cogn Comput 8, 494–504 (2016). https://doi.org/10.1007/s12559-015-9377-1

  • DOI: https://doi.org/10.1007/s12559-015-9377-1
