Rethinking How to Extend Average Precision to Graded Relevance

  • Marco Ferrante
  • Nicola Ferro
  • Maria Maistro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8685)


We present two new measures of retrieval effectiveness inspired by Graded Average Precision (GAP), which extends Average Precision (AP) to graded relevance judgements. Starting from the random choice of a user, we define Extended Graded Average Precision (xGAP) and Expected Graded Average Precision (eGAP), which are more accurate than GAP when there are few highly relevant documents with a high probability of being considered relevant by users. The proposed measures are then evaluated on the TREC 10, TREC 14, and TREC 21 collections, showing that they capture a different angle from GAP and that they are robust to incomplete judgements and shallow pools.
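To make the underlying user model concrete, the sketch below illustrates the threshold idea behind GAP (Robertson et al., SIGIR 2010) as summarized in the abstract: a user picks a relevance threshold at random, grades at or above it are treated as binary-relevant, and an AP-style score is averaged over that random choice. This is a simplified illustration of the model, not the exact published GAP normalization, and the function names are ours.

```python
def average_precision(binary_rels):
    """Standard binary AP over a ranked list of 0/1 judgements."""
    hits, precisions = 0, []
    for rank, rel in enumerate(binary_rels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    total_relevant = sum(binary_rels)
    return sum(precisions) / total_relevant if total_relevant else 0.0

def thresholded_graded_ap(graded_rels, threshold_probs):
    """Expected binarized AP over random grade thresholds.

    graded_rels: ranked list of integer grades (0 = non-relevant).
    threshold_probs: probability that a user treats grades >= g as
    relevant, for each grade g >= 1 (probabilities sum to 1).
    Illustrative only; the published GAP formula normalizes differently.
    """
    score = 0.0
    for g, p in threshold_probs.items():
        binarized = [1 if rel >= g else 0 for rel in graded_rels]
        score += p * average_precision(binarized)
    return score

# Example: two relevance grades, with a user equally likely to
# require grade >= 1 or grade >= 2.
ranking = [2, 0, 1, 2]
print(thresholded_graded_ap(ranking, {1: 0.5, 2: 0.5}))  # ≈ 0.7778
```

The same machinery (a random experiment over user behaviour, then an expectation) is the starting point the abstract describes for xGAP and eGAP, which differ in how that expectation is taken.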


Keywords: Relevant Document · Average Precision · Relevance Judgement · Random Experiment · Binary Measure




References

  1. Buckley, C., Voorhees, E.M.: Retrieval Evaluation with Incomplete Information. In: SIGIR 2004, pp. 25–32. ACM Press, USA (2004)
  2. Buckley, C., Voorhees, E.M.: Retrieval System Evaluation. In: TREC. Experiment and Evaluation in Information Retrieval, pp. 53–78. MIT Press, USA (2005)
  3. Clarke, C.L.A., Craswell, N., Voorhees, E.M.: Overview of the TREC 2012 Web Track. In: TREC 2012, pp. 1–8. NIST, Special Publication 500-298, USA (2013)
  4. Hawking, D., Craswell, N.: Overview of the TREC-2001 Web Track. In: TREC 2001, pp. 61–67. NIST, Special Publication 500-250, USA (2001)
  5. Järvelin, K., Kekäläinen, J.: Cumulated Gain-Based Evaluation of IR Techniques. ACM Transactions on Information Systems (TOIS) 20(4), 422–446 (2002)
  6. Kendall, M.G.: Rank Correlation Methods. Griffin, Oxford (1948)
  7. Moffat, A., Zobel, J.: Rank-biased Precision for Measurement of Retrieval Effectiveness. ACM Transactions on Information Systems (TOIS) 27(1), 2:1–2:27 (2008)
  8. Resnick, S.I.: A Probability Path. Birkhäuser, Boston (2005)
  9. Robertson, S.E., Kanoulas, E., Yilmaz, E.: Extending Average Precision to Graded Relevance Judgments. In: SIGIR 2010, pp. 603–610. ACM Press, USA (2010)
  10. Voorhees, E.M.: Evaluation by Highly Relevant Documents. In: SIGIR 2001, pp. 74–82. ACM Press, USA (2001)
  11. Voorhees, E.M.: Overview of the TREC 2005 Robust Retrieval Track. In: TREC 2005. NIST, Special Publication 500-266, USA (2005)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Marco Ferrante (1)
  • Nicola Ferro (2)
  • Maria Maistro (2)

  1. Dept. of Mathematics, University of Padua, Italy
  2. Dept. of Information Engineering, University of Padua, Italy
