Abstract
The query-performance prediction task is estimating retrieval effectiveness in the absence of relevance judgments. The task becomes highly challenging over theWeb due to, among other reasons, the effect of low quality (e.g., spam) documents on retrieval performance. To address this challenge, we present a novel prediction approach that utilizes queryindependent document-quality measures. While using these measures was shown to improve Web-retrieval effectiveness, this is the first study demonstrating the clear merits of using them for query-performance prediction. Evaluation performed with large scale Web collections shows that our methods post prediction quality that often surpasses that of state-of-the-art predictors, including those devised specifically for Web retrieval.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Balasubramanian, N., Kumaran, G., Carvalho, V.R.: Predicting query performance on the web. In: Proc. of SIGIR, pp. 785–786 (2010)
Bendersky, M., Croft, W.B., Diao, Y.: Quality-biased ranking of web documents. In: Proc. of WSDM, pp. 95–104 (2011)
Bernstein, Y., Billerbeck, B., Garcia, S., Lester, N., Scholer, F., Zobel, J.: RMIT university at trec 2005: Terabyte and robust track. In: Proc. of TREC-14 (2005)
Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In: Proc. of WWW, pp. 107–117 (1998)
Carmel, D., Yom-Tov, E.: Estimating the Query Difficulty for Information Retrieval. In: Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan & Claypool (2010)
Carmel, D., Yom-Tov, E., Darlow, A., Pelleg, D.: What makes a query difficult? In: Proc. of SIGIR, pp. 390–397 (2006)
Clarke, C.L.A., Craswell, N., Soboroff, I.: Overview of the trec 2009 web track. In: Proc. of TREC (2009)
Cormack, G.V., Smucker, M.D., Clarke, C.L.A.: Efficient and effective spam filtering and re-ranking for large web datasets. Information Retrieval 14(5), 441–465 (2011)
Cronen-Townsend, S., Zhou, Y., Croft, W.B.: Predicting query performance. In: Proc. of SIGIR, pp. 299–306 (2002)
Diaz, F.: Performance prediction using spatial autocorrelation. In: Proc. of SIGIR, pp. 583–590 (2007)
Gyöngyi, Z., Garcia-Molina, H.: Web spam taxonomy. In: Proc. of AIRWeb, pp. 39–47 (2005)
Hauff, C., Kelly, D., Azzopardi, L.: A comparison of user and system query performance predictions. In: Proc. of CIKM, pp. 979–988 (2010)
Hauff, C., Murdock, V., Baeza-Yates, R.A.: Improved query difficulty prediction for the web. In: Proc. of CIKM, pp. 439–448 (2008)
Hummel, S., Shtok, A., Raiber, F., Kurland, O., Carmel, D.: Clarity re-visited. In: Proc. of SIGIR, pp. 1039–1040 (2012)
Kurland, O., Lee, L.: PageRank without hyperlinks: Structural re-ranking using links induced by language models. In: Proc. of SIGIR, pp. 306–313 (2005)
Lavrenko, V., Croft, W.B.: Relevance-based language models. In: Proc. of SIGIR, pp. 120–127 (2001)
Lin, J., Metzler, D., Elsayed, T., Wang, L.: Of Ivory and Smurfs: Loxodontan MapReduce Experiments for Web Search. In: Proc. of TREC 2009 (2010)
Shtok, A., Kurland, O., Carmel, D.: Predicting Query Performance by Query-Drift Estimation. In: Azzopardi, L., Kazai, G., Robertson, S., Rüger, S., Shokouhi, M., Song, D., Yilmaz, E. (eds.) ICTIR 2009. LNCS, vol. 5766, pp. 305–312. Springer, Heidelberg (2009)
Shtok, A., Kurland, O., Carmel, D.: Using statistical decision theory and relevance models for query-performance prediction. In: Proc. of SIGIR (2010)
Song, F., Croft, W.B.: A general language model for information retrieval (poster abstract). In: Proc. of SIGIR, pp. 279–280 (1999)
Tomlinson, S.: Robust, Web and Terabyte Retrieval with Hummingbird Search Server at TREC 2004. In: Proc. of TREC-13 (2004)
Vinay, V., Cox, I.J., Milic-Frayling, N., Wood, K.R.: On ranking the effectiveness of searches. In: Proc. of SIGIR, pp. 398–404 (2006)
Voorhees, E.M.: Overview of the TREC 2004 Robust Retrieval Track. In: Proc. of TREC-13 (2004)
Zhai, C., Lafferty, J.D.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Proc. of SIGIR, pp. 334–342 (2001)
Zhao, Y., Scholer, F., Tsegay, Y.: Effective Pre-retrieval Query Performance Prediction Using Similarity and Variability Evidence. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 52–64. Springer, Heidelberg (2008)
Zhou, Y., Croft, B.: Ranking robustness: a novel framework to predict query performance. In: Proc. of CIKM, pp. 567–574 (2006)
Zhou, Y., Croft, B.: Query performance prediction in web search environments. In: Proc. of SIGIR, pp. 543–550 (2007)
Zhou, Y., Croft, W.B.: Document quality models for web ad hoc retrieval. In: Proc. of CIKM, pp. 331–332 (2005)
Zhu, X., Gauch, S.: Incorporating quality metrics in centralized/distributed information retrieval on the world wide web. In: Proc. of SIGIR, pp. 288–295 (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Raiber, F., Kurland, O. (2013). Using Document-Quality Measures to Predict Web-Search Effectiveness. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-36973-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer ScienceComputer Science (R0)