
Rate-Constrained Ranking and the Rate-Weighted AUC

  • Louise A. C. Millard
  • Peter A. Flach
  • Julian P. T. Higgins
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8725)

Abstract

Ranking tasks, where instances are ranked by a predicted score, are common in machine learning. Often only a proportion of the instances in the ranking can be processed, and this quantity, the predicted positive rate (PPR), may not be known precisely. In this situation, the evaluation of a model’s performance needs to account for these imprecise constraints on the PPR, but existing metrics such as the area under the ROC curve (AUC) and early retrieval metrics such as normalised discounted cumulative gain (NDCG) cannot do this. In this paper we introduce a novel metric, the rate-weighted AUC (rAUC), to evaluate ranking models when constraints across the PPR exist, and provide an efficient algorithm to estimate the rAUC using an empirical ROC curve. Our experiments show that rAUC, AUC and NDCG often select different models. We demonstrate the usefulness of rAUC on a practical application: ranking articles for rapid reviews in epidemiology.
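The abstract describes the rAUC as weighting a ranking model's performance across the feasible range of predicted positive rates, rather than at a single fixed cutoff. As an illustrative sketch only (the paper itself gives the formal definition and an efficient estimation algorithm over the empirical ROC curve), the following Python computes a weighted average of the true positive rate over candidate PPR cutoffs. The function name `rate_weighted_auc`, the `ppr_weight` callable, and the normalised averaging scheme are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def rate_weighted_auc(scores, labels, ppr_weight):
    """Illustrative sketch of a rate-weighted ranking metric.

    Each cutoff depth k/n is treated as a candidate predicted positive
    rate (PPR); the recall (TPR) at that depth is weighted by the
    user-supplied PPR distribution. Assumes at least one positive label.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores)      # rank instances by decreasing score
    ranked = labels[order]
    n = ranked.size
    n_pos = ranked.sum()
    cum_tp = np.cumsum(ranked)       # true positives within the top k

    total = norm = 0.0
    for k in range(1, n + 1):
        ppr = k / n                  # proportion of instances processed
        w = ppr_weight(ppr)          # weight from the PPR distribution
        total += w * cum_tp[k - 1] / n_pos
        norm += w
    return total / norm

# Example: the PPR is known only to lie between 10% and 50%,
# with a uniform weight within that window (a hypothetical constraint).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 1, 0, 0]
print(rate_weighted_auc(scores, labels,
                        lambda r: 1.0 if 0.1 <= r <= 0.5 else 0.0))
```

A point-mass weight on a single PPR would recover recall at that fixed cutoff, while a broad weight approaches an unconstrained average; the paper's efficient algorithm instead works directly over the empirical ROC curve rather than re-scanning the ranking at every depth as this sketch does.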

Keywords

Random Forest · True Positive Rate · Support Vector Machine Model · Rapid Review · True Negative Rate



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Louise A. C. Millard 1, 2, 3
  • Peter A. Flach 1, 3
  • Julian P. T. Higgins 2, 4

  1. Intelligent Systems Laboratory, University of Bristol, United Kingdom
  2. School of Social and Community Medicine, University of Bristol, United Kingdom
  3. MRC Integrative Epidemiology Unit, University of Bristol, United Kingdom
  4. Centre for Reviews and Dissemination, University of York, York, United Kingdom
