Using Document-Quality Measures to Predict Web-Search Effectiveness

  • Conference paper
Advances in Information Retrieval (ECIR 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7814)

Abstract

The query-performance prediction task is to estimate retrieval effectiveness in the absence of relevance judgments. The task becomes highly challenging over the Web due to, among other reasons, the effect of low-quality (e.g., spam) documents on retrieval performance. To address this challenge, we present a novel prediction approach that utilizes query-independent document-quality measures. While using these measures was shown to improve Web-retrieval effectiveness, this is the first study demonstrating the clear merits of using them for query-performance prediction. Evaluation performed with large-scale Web collections shows that our methods post prediction quality that often surpasses that of state-of-the-art predictors, including those devised specifically for Web retrieval.
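
The abstract does not spell out the predictors or the evaluation protocol, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical predictor that averages a query-independent quality score (for example, a spam-classifier score) over the top-k retrieved documents, and it assumes that prediction quality is measured as the Pearson correlation between predicted values and the actual per-query effectiveness, which is a common convention in query-performance prediction work. All names and numbers are invented for illustration.

```python
from math import sqrt
from statistics import mean


def quality_predictor(ranked_doc_ids, quality, k=3):
    """Hypothetical predictor: mean quality score of the top-k retrieved documents.

    `quality` maps a document id to a query-independent score in [0, 1],
    where higher means better quality (an assumption of this sketch).
    """
    top = ranked_doc_ids[:k]
    return mean(quality[d] for d in top)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    quality = {"d1": 0.9, "d2": 0.8, "d3": 0.2, "d4": 0.1, "d5": 0.6}
    rankings = {
        "q1": ["d1", "d2", "d5"],  # mostly high-quality results
        "q2": ["d3", "d4", "d5"],  # mostly low-quality results
        "q3": ["d1", "d3", "d5"],  # mixed results
    }
    # Per-query effectiveness (e.g., average precision), also invented.
    actual_ap = {"q1": 0.55, "q2": 0.10, "q3": 0.30}

    queries = sorted(rankings)
    predicted = [quality_predictor(rankings[q], quality) for q in queries]
    observed = [actual_ap[q] for q in queries]
    print("prediction quality (Pearson):", pearson(predicted, observed))
```

A higher correlation would indicate that the quality-based value tracks actual effectiveness more closely across queries; a real evaluation would use relevance judgments from the test collections rather than toy values.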

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Raiber, F., Kurland, O. (2013). Using Document-Quality Measures to Predict Web-Search Effectiveness. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_12

  • DOI: https://doi.org/10.1007/978-3-642-36973-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36972-8

  • Online ISBN: 978-3-642-36973-5

  • eBook Packages: Computer Science, Computer Science (R0)
