Query Performance Prediction for Information Retrieval Based on Covering Topic Score

Lang, Hao; Wang, Bin; Jones, Gareth; Li, Jin-Tao; Ding, Fan; Liu, Yi-Xuan

doi:10.1007/s11390-008-9155-6

Regular Paper
Published: 05 August 2008

Journal of Computer Science and Technology
Volume 23, pages 590–601 (2008)

Hao Lang¹, Bin Wang¹, Gareth Jones², Jin-Tao Li¹, Fan Ding¹ & Yi-Xuan Liu¹

Abstract

We present a statistical method called Covering Topic Score (CTS) to predict query performance for information retrieval. The estimate is based on how well the topic of a user’s query is covered by the documents retrieved by a given retrieval system. The approach is conceptually simple and intuitive, and it extends easily to features beyond bag-of-words, such as phrases and term proximity. Experiments demonstrate that CTS correlates significantly with query performance across a variety of TREC test collections, and that its predictive power increases further when phrase and term-proximity features are included. We compare CTS with previous state-of-the-art methods for query performance prediction, including clarity score and robustness score. Our experimental results show that CTS consistently performs better than, or at least as well as, these methods. In addition to its high effectiveness, CTS has very low computational complexity, which makes it practical for real applications.
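The abstract does not give the CTS formula, so the following is only a toy sketch of the general idea it describes: score a query by how much of its topic the retrieved documents cover, then check how well that score tracks actual retrieval performance. The IDF-weighted coverage measure, the function names `coverage_score` and `pearson`, and the sample data are all assumptions made for illustration and are not the authors' method.

```python
import math
from typing import Dict, List, Sequence


def coverage_score(query_terms: Sequence[str],
                   top_docs: Sequence[Sequence[str]],
                   idf: Dict[str, float]) -> float:
    """Toy coverage measure (an assumption, not the paper's CTS formula).

    For each retrieved document, compute the IDF-weighted share of query
    terms it contains, then average that share over the retrieved documents.
    """
    total_weight = sum(idf.get(t, 0.0) for t in query_terms)
    if total_weight == 0 or not top_docs:
        return 0.0
    score = 0.0
    for doc in top_docs:
        doc_terms = set(doc)
        covered = sum(idf.get(t, 0.0) for t in query_terms if t in doc_terms)
        score += covered / total_weight
    return score / len(top_docs)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation between predictor scores and a performance metric."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: two query terms with made-up IDF weights and
# two retrieved documents represented as token lists.
idf = {"query": 1.0, "performance": 3.0}
docs = [["query", "performance", "prediction"], ["query", "log"]]
score = coverage_score(["query", "performance"], docs, idf)
```

A predictor of this kind would typically be evaluated by computing its score for every query in a test collection and correlating the scores with a per-query effectiveness measure such as average precision, for example with `pearson(scores, average_precisions)`.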

Author information

Authors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Hao Lang, Bin Wang, Jin-Tao Li, Fan Ding & Yi-Xuan Liu
School of Computing, Dublin City University, Dublin, Ireland
Gareth Jones

Corresponding author

Correspondence to Hao Lang.

Additional information

This work is supported by the National Natural Science Foundation of China under Grant No. 60603094 and the National Grand Fundamental Research 973 Program of China under Grant No. 2004CB318109.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(PDF 100 kb)

About this article

Cite this article

Lang, H., Wang, B., Jones, G. et al. Query Performance Prediction for Information Retrieval Based on Covering Topic Score. J. Comput. Sci. Technol. 23, 590–601 (2008). https://doi.org/10.1007/s11390-008-9155-6

Received: 18 June 2007
Revised: 20 March 2008
Published: 05 August 2008
Issue Date: July 2008
DOI: https://doi.org/10.1007/s11390-008-9155-6
