
StreamMyRelevance!

Prediction of Result Relevance from Real-Time Interactions and Its Application to Hotel Search
  • Maximilian Speicher
  • Sebastian Nuck
  • Andreas Both
  • Martin Gaedke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8541)

Abstract

The prime aspect of quality for search-driven web applications is to provide users with the best possible results for a given query. Thus, it is necessary to predict the relevance of results a priori. Current solutions mostly rely on clicks on results for these predictions, but research has shown that it is highly beneficial to also consider additional features of user interaction. Internet users now produce such interactions in steadily growing amounts, and processing these amounts calls for streaming-based approaches and incrementally updatable relevance models. We present StreamMyRelevance!, a novel streaming-based system for ensuring quality of ranking in search engines. Our approach provides a complete pipeline from collecting interactions in real time to processing them incrementally on the server side. We conducted a large-scale evaluation with real-world data from the hotel search domain. Results show that our system yields predictions as good as those of competing state-of-the-art systems while, by design of the underlying framework, offering higher efficiency, robustness, and scalability.
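
To make the idea of an incrementally updatable relevance model concrete, the following is a minimal sketch and not the system described in the paper. It assumes an online logistic regression as a stand-in learner, and the feature names (`hover_seconds`, `clicked`), the boolean relevance label, and the toy stream are all hypothetical. The point it illustrates is that each interaction record updates the model once and is then discarded, so no batch retraining is needed.

```python
import math


class IncrementalRelevanceModel:
    """Online logistic regression, updated one example at a time.

    Assumption: a stand-in learner for illustration only; the abstract
    does not state which model StreamMyRelevance! uses.
    """

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight
        self.bias = 0.0

    def predict(self, features):
        """Return the estimated probability that a result is relevant."""
        z = self.bias + sum(
            self.weights.get(name, 0.0) * value
            for name, value in features.items()
        )
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, relevant):
        """Apply one stochastic gradient step for a single observation."""
        error = self.predict(features) - (1.0 if relevant else 0.0)
        self.bias -= self.learning_rate * error
        for name, value in features.items():
            old = self.weights.get(name, 0.0)
            self.weights[name] = old - self.learning_rate * error * value


def train_on_stream(model, stream):
    """Consume (features, label) pairs one by one, never storing them."""
    for features, relevant in stream:
        model.update(features, relevant)
    return model


# Hypothetical interaction records: per-result features and a relevance label.
high = {"hover_seconds": 2.0, "clicked": 1.0}
low = {"hover_seconds": 0.1, "clicked": 0.0}
toy_stream = ((f, y) for _ in range(200) for f, y in ((high, True), (low, False)))

model = train_on_stream(IncrementalRelevanceModel(), toy_stream)
print(model.predict(high), model.predict(low))
```

Because `update` touches only the features present in the current record, the cost per event is independent of how many events have been seen, which is the property a streaming pipeline needs from its model.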

Keywords

Streaming, Real-Time, Interaction Tracking, Learning to Rank, Relevance Prediction



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Maximilian Speicher (1, 2)
  • Sebastian Nuck (2, 3)
  • Andreas Both (2)
  • Martin Gaedke (1)
  1. Chemnitz University of Technology, Chemnitz, Germany
  2. R&D, Unister GmbH, Leipzig, Germany
  3. Leipzig University of Applied Sciences, Leipzig, Germany
