On the Null Distribution of the Precision and Recall Curve

  • Miguel Lopes
  • Gianluca Bontempi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8725)


Precision-recall curves (pr-curves) and the associated area under the curve (AUPRC) are commonly used to assess the accuracy of information retrieval (IR) algorithms. An informative baseline is random selection: its probability distribution makes it possible to assess the significance of a pr-curve, expressed as a p-value relative to the null hypothesis of random selection. To our knowledge, no analytical expression of the null distribution of empirical pr-curves is available, and the only measure of significance used in the literature relies on non-parametric Monte Carlo simulations. In this paper, we analytically derive the expected null pr-curve and AUPRC for different interpolation strategies. We also derive the AUPRC variance and use it to propose a continuous approximation to the null AUPRC distribution, based on the beta distribution. Properties of the empirical pr-curve and of common interpolation strategies are also discussed.
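The Monte Carlo baseline and the beta approximation mentioned above can be illustrated with a minimal sketch. It does not reproduce the paper's analytical expressions: the mean and variance are estimated here by simulation, average precision is assumed as one non-interpolated AUPRC estimate, and the function names and problem sizes (200 items, 20 relevant) are hypothetical choices made only for illustration.

```python
import random
import statistics


def average_precision(labels):
    """Average precision of a ranked list of 0/1 labels (1 = relevant)."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each relevant item
    return total / hits if hits else 0.0


def null_auprc_samples(n_items, n_positives, n_draws, seed=0):
    """Monte Carlo draws of the AUPRC when the ranking is purely random."""
    rng = random.Random(seed)
    labels = [1] * n_positives + [0] * (n_items - n_positives)
    samples = []
    for _ in range(n_draws):
        rng.shuffle(labels)  # a random ranking is a random permutation
        samples.append(average_precision(labels))
    return samples


def beta_by_moments(mean, variance):
    """Beta(a, b) parameters whose mean and variance match the given values."""
    common = mean * (1.0 - mean) / variance - 1.0
    return mean * common, (1.0 - mean) * common


# Hypothetical problem size, chosen only for illustration.
samples = null_auprc_samples(n_items=200, n_positives=20, n_draws=2000)
mean = statistics.fmean(samples)
variance = statistics.variance(samples)
a, b = beta_by_moments(mean, variance)
print(f"null mean={mean:.4f} variance={variance:.6f} beta=({a:.2f}, {b:.2f})")
```

With a fitted beta distribution, the p-value of an observed AUPRC would be the upper-tail probability of that distribution at the observed value. The moment matching above only yields valid parameters when the variance is smaller than mean × (1 − mean), which holds for the simulated samples here.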


Information retrieval; precision-recall curves; statistical significance assessment





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Miguel Lopes (1)
  • Gianluca Bontempi (1)
  1. Machine Learning Group, ULB, Interuniversity Institute of Bioinformatics in Brussels (IB)2, Brussels, Belgium
