Predicting Query Performance via Classification

Collins-Thompson, Kevyn; Bennett, Paul N.

doi:10.1007/978-3-642-12275-0_15

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5993)

Abstract

We investigate using topic prediction data, as a summary of document content, to compute measures of search result quality. Unlike existing quality measures such as query clarity that require the entire content of the top-ranked results, class-based statistics can be computed efficiently online, because class information is compact enough to precompute and store in the index. In an empirical study we compare the performance of class-based statistics to their language-model counterparts for two performance-related tasks: predicting query difficulty and expansion risk. Our findings suggest that using class predictions can offer comparable performance to full language models while reducing computation overhead.
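As a rough illustration of the kind of class-based statistic the abstract describes, the sketch below computes a clarity-style score: the KL divergence between the average class distribution of the top-ranked results and a collection-wide class prior. The function names (`result_class_distribution`, `class_clarity`), the two-class example data, and the uniform averaging over results are assumptions made for illustration. They are not taken from the paper, which may define its statistics differently.

```python
import math
from typing import Dict, List

# Hypothetical representation: class label -> predicted probability,
# assumed to be precomputed per document and stored alongside the index.
ClassDist = Dict[str, float]


def result_class_distribution(docs: List[ClassDist]) -> ClassDist:
    """Average the per-document class distributions of the top-ranked results."""
    if not docs:
        raise ValueError("need at least one document")
    totals: Dict[str, float] = {}
    for dist in docs:
        for label, p in dist.items():
            totals[label] = totals.get(label, 0.0) + p
    return {label: total / len(docs) for label, total in totals.items()}


def class_clarity(docs: List[ClassDist], collection_prior: ClassDist,
                  eps: float = 1e-12) -> float:
    """KL divergence (in bits) from the collection class prior to the
    averaged class distribution of the results.

    This is an assumed class-based analogue of query clarity, not the
    paper's exact definition. `eps` guards against a zero prior.
    """
    result_dist = result_class_distribution(docs)
    score = 0.0
    for label, p in result_dist.items():
        if p > 0.0:
            q = max(collection_prior.get(label, 0.0), eps)
            score += p * math.log2(p / q)
    return score


# Illustrative data only: a uniform prior over two made-up classes.
prior = {"sports": 0.5, "science": 0.5}
focused = [{"sports": 1.0}, {"sports": 1.0}]   # results agree on one class
diffuse = [{"sports": 1.0}, {"science": 1.0}]  # results split across classes

focused_score = class_clarity(focused, prior)  # 1.0 bit
diffuse_score = class_clarity(diffuse, prior)  # 0.0 bits
```

Under the usual clarity intuition, a higher score means the results concentrate on fewer topics than the collection as a whole. Because each document contributes only a short vector of class probabilities, the score needs no access to the full text of the results at query time.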

Author information

Authors and Affiliations

Microsoft Research, 1 Microsoft Way, Redmond, WA, USA, 98052
Kevyn Collins-Thompson & Paul N. Bennett

Editor information

Editors and Affiliations

Adaptive Information Cluster, Dublin City University, Dublin, 9, Ireland
Cathal Gurrin
The Open University, Walton Hall, MK7 6HF, Milton Keynes, UK
Yulan He
Microsoft Research Ltd, 7 JJ Thomson Avenue, CB3 0FB, Cambridge, UK
Gabriella Kazai
Department of Computer Science, University of Essex, Wivenhoe Park, CO4 3SQ, Colchester, UK
Udo Kruschwitz
The Open University, Walton Hall, Milton Keynes, UK
Suzanne Little
University of London, London, UK
Thomas Roelleke
Knowledge Media Institute, The Open University, MK7 6AA, Milton Keynes, UK
Stefan Rüger
Department of Computing Science, University of Glasgow, 17 Lilybank Gardens, G12 8QQ, Glasgow, UK
Keith van Rijsbergen

About this paper

Cite this paper

Collins-Thompson, K., Bennett, P.N. (2010). Predicting Query Performance via Classification. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_15

DOI: https://doi.org/10.1007/978-3-642-12275-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science, Computer Science (R0)
