The Curious Incidence of Bias Corrections in the Pool

  • Aldo Lipani
  • Mihai Lupu
  • Allan Hanbury
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9626)


Recently, it has been discovered that it is possible to mitigate the Pool Bias of Precision at cut-off (P@n) when used with the fixed-depth pooling strategy, by measuring the effect of the tested run against the pooled runs. In this paper we extend this analysis and test the existing methods on different pooling strategies, simulated on a selection of 12 TREC test collections. We observe how the different methodologies to correct the pool bias behave, and provide guidelines about which pooling strategy should be chosen.
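
As a minimal illustration of the pool bias that the abstract refers to, the sketch below builds a fixed-depth pool from a set of runs and computes P@n for a run that did not contribute to that pool, treating unjudged documents as non-relevant. The function names (`fixed_depth_pool`, `precision_at`), the toy runs, and the relevance set are all assumptions made for illustration. It shows only the bias itself, not the correction methods studied in the paper.

```python
from typing import Dict, List, Set


def fixed_depth_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Union of the top-`depth` documents of every pooled run."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def precision_at(ranking: List[str], judged_relevant: Set[str], n: int) -> float:
    """P@n, counting every document not judged relevant (including unjudged ones) as non-relevant."""
    return sum(1 for doc in ranking[:n] if doc in judged_relevant) / n


# Toy data (hypothetical): the full relevance set is known here only for illustration.
truly_relevant = {"d1", "d2", "d5", "d7"}
pooled_runs = {
    "runA": ["d1", "d2", "d3", "d4"],
    "runB": ["d2", "d1", "d4", "d6"],
}
new_run = ["d5", "d1", "d7", "d3"]  # did not contribute to the pool

pool = fixed_depth_pool(pooled_runs, depth=2)
# Assessors judge only pooled documents, so relevance is known only inside the pool.
judged_relevant = truly_relevant & pool

p_pooled = precision_at(new_run, judged_relevant, n=4)  # what the test collection reports
p_true = precision_at(new_run, truly_relevant, n=4)     # what full judgments would report
pool_bias = p_true - p_pooled
```

In this toy setting the new run scores P@4 = 0.25 against the pooled judgments but 0.75 against the full relevance set, so the uncorrected measure underestimates it by 0.5.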


Keywords: Information Retrieval; Mean Absolute Error; Information Retrieval System; Test Collection; Bias Correction Method



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of Software Technology and Interactive Systems (ISIS), Vienna University of Technology, Vienna, Austria
