Score Distributions in Information Retrieval

  • Avi Arampatzis
  • Stephen Robertson
  • Jaap Kamps
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5766)

Abstract

We review the history of modeling score distributions, focusing on the normal-exponential mixture and examining both the theoretical and the empirical evidence supporting its use. We discuss previously suggested conditions that valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses concerning the component distributions under some limiting conditions of parameter values. Of all the mixtures suggested in the past, the current theoretical argument points to the mixture of two gamma distributions as the most likely universal model, with the normal-exponential being a usable approximation. Beyond the theoretical contribution, we provide new experimental evidence showing that vector space or geometric models, and BM25, are “friendly” to the normal-exponential, and that the non-convexity problem of this mixture is not severe in practice.
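
The non-convexity mentioned above can be illustrated with a minimal sketch. It is not taken from the paper; it assumes the usual reading of the normal-exponential mixture, in which relevant scores follow a normal distribution with mean mu and standard deviation sigma, and non-relevant scores follow an exponential distribution with rate lam. The function names and the parameter values are hypothetical and chosen only for illustration. Under these assumptions the log-likelihood ratio of relevant to non-relevant densities is a concave quadratic in the score, so it stops increasing at mu + lam * sigma^2; above that score the model implies a falling probability of relevance, which is where the recall-fallout curve loses convexity.

```python
import math


def recall(t, mu, sigma):
    """P(score > t | relevant) under a normal model for relevant scores."""
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))


def fallout(t, lam):
    """P(score > t | non-relevant) under an exponential model (scores >= 0)."""
    return math.exp(-lam * max(t, 0.0))


def log_likelihood_ratio(s, mu, sigma, lam):
    """log of normal density over exponential density at score s >= 0."""
    log_rel = (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
               - (s - mu) ** 2 / (2.0 * sigma ** 2))
    log_non = math.log(lam) - lam * s
    return log_rel - log_non


def turning_point(mu, sigma, lam):
    """Score at which the likelihood ratio peaks (derivative equals zero)."""
    return mu + lam * sigma ** 2


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    mu, sigma, lam = 5.0, 1.5, 1.0
    s_star = turning_point(mu, sigma, lam)
    print("likelihood ratio peaks at score", s_star)
    # Share of relevant documents scoring above the turning point,
    # i.e. the part of the ranking affected by the non-convexity.
    print("recall at turning point:", recall(s_star, mu, sigma))
    print("fallout at turning point:", fallout(s_star, lam))
```

The recall at the turning point gives a rough sense of how much of the ranking is affected: when it is small, only a small fraction of relevant documents falls in the region where the fitted model misbehaves.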

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Avi Arampatzis (1)
  • Stephen Robertson (2)
  • Jaap Kamps (1)

  1. University of Amsterdam, the Netherlands
  2. Microsoft Research, Cambridge, UK