ROC curves in cost space
ROC curves and cost curves are two popular ways of visualising classifier performance, finding appropriate thresholds according to the operating condition, and deriving useful aggregated measures such as the area under the ROC curve (AUC) or the area under the optimal cost curve. In this paper we present new findings and connections between ROC space and cost space. In particular, we show that ROC curves can be transferred to cost space by means of a very natural threshold choice method, which sets the decision threshold such that the proportion of positive predictions equals the operating condition. We call these new curves rate-driven curves, and we demonstrate that the expected loss as measured by the area under these curves is linearly related to AUC. We show that the rate-driven curves are the genuine equivalent of ROC curves in cost space, establishing a point-point rather than a point-line correspondence. Furthermore, a decomposition of the rate-driven curves is introduced which separates the loss due to the threshold choice method from the ranking loss (Kendall τ distance). We also derive the corresponding curve to the ROC convex hull in cost space; this curve is different from the lower envelope of the cost lines, as the latter assumes only optimal thresholds are chosen.
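The rate-driven threshold choice described above can be illustrated with a small numerical sketch. It is not the authors' code and rests on several assumptions that the abstract does not state: labels are 1 for the positive class, higher scores mean "more positive", the operating condition c is a cost proportion in [0, 1], and the loss at c is taken to be 2·(c·π0·FNR + (1 − c)·π1·FPR), where π0 and π1 are the positive and negative class proportions. The function names (`rate_driven_curve`, `curve_area`, `auc`) are hypothetical. Under these assumptions, and for a ranking without ties, the area under the resulting curve works out to π0·π1·(1 − 2·AUC) + 1/3, which is one concrete form of the linear relation to AUC.

```python
import numpy as np


def rate_driven_curve(scores, labels):
    """Loss at operating conditions c = k/n, k = 0..n.

    Assumed convention: at c = k/n the k highest-scored instances are
    predicted positive, so the predicted positive rate equals c.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n = labels.size
    # Sort labels by decreasing score.
    sorted_labels = labels[np.argsort(-scores, kind="stable")]
    # tp[k] / fp[k]: true / false positives when the top k are predicted positive.
    tp = np.concatenate(([0], np.cumsum(sorted_labels)))
    fp = np.concatenate(([0], np.cumsum(1 - sorted_labels)))
    fn = labels.sum() - tp
    c = np.arange(n + 1) / n
    # Assumed loss: 2 * (c * pi0 * FNR + (1 - c) * pi1 * FPR) = 2 * (c*fn + (1-c)*fp) / n.
    loss = 2.0 * (c * fn + (1.0 - c) * fp) / n
    return c, loss


def curve_area(c, loss):
    """Trapezoidal area under the curve (expected loss for uniform c)."""
    return float(np.sum((loss[1:] + loss[:-1]) * np.diff(c)) / 2.0)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))
```

Calling `rate_driven_curve(scores, labels)` and passing the result to `curve_area` gives the expected loss, which can be compared against `pi0 * pi1 * (1 - 2 * auc(scores, labels)) + 1/3`. On the grid c = k/n the trapezoidal estimate differs from that expression only by a term of order 1/n².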
Keywords: Cost curves, ROC curves, Cost-sensitive evaluation, Ranking performance, Operating condition, Kendall tau distance, Area Under the ROC Curve (AUC)
We would like to thank the anonymous referees for their helpful comments. This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project PROMETEO/2008/051, the COST—European Cooperation in the field of Scientific and Technical Research IC0801 AT, and the REFRAME project granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Engineering and Physical Sciences Research Council in the UK and the Ministerio de Economía y Competitividad in Spain.