Measuring the Ability of Score Distributions to Model Relevance

Cummins, Ronan

doi:10.1007/978-3-642-25631-8_3

Ronan Cummins

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7097))

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

Modelling the score distribution of documents returned from any information retrieval (IR) system is of both theoretical and practical importance. The goal is to infer, to some degree of confidence, which documents are relevant and which are non-relevant based on their scores.
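To make the idea of inferring relevance from a score concrete, the following is a minimal sketch, not the paper's method. It assumes a two-component mixture in which relevant and non-relevant scores each follow a log-normal density, and applies Bayes' rule to obtain the probability of relevance at a given score. The function names, the parameter tuples `(mu, sigma)` and the mixing weight `lam` (the assumed proportion of relevant documents) are all illustrative.

```python
import math


def lognormal_pdf(s, mu, sigma):
    """Density of a log-normal distribution at score s (zero for s <= 0)."""
    if s <= 0.0:
        return 0.0
    z = (math.log(s) - mu) / sigma
    return math.exp(-0.5 * z * z) / (s * sigma * math.sqrt(2.0 * math.pi))


def posterior_relevance(s, lam, rel, non):
    """P(relevant | score s) under a two-component log-normal mixture.

    lam is the assumed prior proportion of relevant documents;
    rel and non are hypothetical (mu, sigma) tuples for each component.
    """
    num = lam * lognormal_pdf(s, *rel)
    den = num + (1.0 - lam) * lognormal_pdf(s, *non)
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    for score in (0.5, 1.0, 5.0, 10.0):
        p = posterior_relevance(score, 0.1, (2.0, 0.5), (0.0, 0.5))
        print(f"score={score:5.1f}  P(rel|score)={p:.4f}")
```

When both components share the same parameters, the score carries no information and the posterior equals the prior `lam`, which is a quick sanity check on any such model.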

In this paper, we show how the performance of mixtures of score distributions can be compared using inference of query performance as a measure of utility. We (1) outline methods which can directly calculate average precision from the parameters of a mixture distribution. We (2) empirically evaluate a number of mixtures for the task of inferring query performance, and show that the log-normal mixture can model more relevance information than other possible mixtures. Finally, we (3) perform an empirical analysis of the mixtures using the recall-fallout convexity hypothesis.
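As an illustration of how average precision might be obtained from mixture parameters, here is a minimal sketch under stated assumptions. It is not necessarily the calculation used in the paper. It uses the common continuous approximation of average precision as the integral of precision with respect to recall, taken as the score threshold sweeps from high to low. The log-normal components, the mixing weight `lam`, the threshold range and the simple Riemann sum are all assumptions made for this example.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution (zero for x <= 0)."""
    if x <= 0.0:
        return 0.0
    return normal_cdf(math.log(x), mu, sigma)


def average_precision(cdf_rel, cdf_non, lam, lo, hi, steps=20000):
    """Approximate AP as the integral of precision d(recall).

    cdf_rel / cdf_non: score CDFs of relevant / non-relevant documents.
    lam: assumed proportion of relevant documents in the mixture.
    The threshold is swept from hi down to lo on a uniform grid.
    """
    ap = 0.0
    prev_recall = 1.0 - cdf_rel(hi)
    for i in range(1, steps + 1):
        s = hi - (hi - lo) * i / steps
        recall = 1.0 - cdf_rel(s)    # fraction of relevant docs above s
        fallout = 1.0 - cdf_non(s)   # fraction of non-relevant docs above s
        retrieved = lam * recall + (1.0 - lam) * fallout
        precision = lam * recall / retrieved if retrieved > 0.0 else 1.0
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    rel = lambda s: lognormal_cdf(s, 2.0, 0.5)
    non = lambda s: lognormal_cdf(s, 0.0, 0.5)
    print(average_precision(rel, non, lam=0.1, lo=1e-6, hi=1e3))
```

If the two components are identical, precision equals `lam` at every threshold, so the estimate collapses to roughly `lam`. The further the relevant component sits above the non-relevant one, the closer the estimate gets to 1. A uniform grid over scores is coarse; a grid spaced in log-score or in recall would likely be more accurate for heavy-tailed components.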

Author information

Authors and Affiliations

Department of Information Technology, National University of Ireland, Galway, Ireland
Ronan Cummins

Editor information

Editors and Affiliations

Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20182, Dubai, United Arab Emirates
Mohamed Vall Mohamed Salem
Faculty of Engineering and IT, Dubai International Academic City, Block 11, 1st and 2nd Floor, P.O. Box 345015, Dubai, United Arab Emirates
Khaled Shaalan
Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20183, Dubai, United Arab Emirates
Farhad Oroumchian
Department of Electrical and Computer Engineering, University of Tehran, Faculty of Engineering, North Kargar Street, P.O. Box 14395-515, Tehran, Iran
Azadeh Shakery
Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20183, Dubai, United Arab Emirates
Halim Khelalfa

About this paper

Cite this paper

Cummins, R. (2011). Measuring the Ability of Score Distributions to Model Relevance. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_3

DOI: https://doi.org/10.1007/978-3-642-25631-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25630-1
Online ISBN: 978-3-642-25631-8
eBook Packages: Computer Science, Computer Science (R0)
