A critical analysis of variants of the AUC

Vanderlooy, Stijn; Hüllermeier, Eyke

doi:10.1007/s10994-008-5070-x

Published: 15 July 2008

Machine Learning, Volume 72, pages 247–262 (2008)

Abstract

The area under the ROC curve, or AUC, has been widely used to assess the ranking performance of binary scoring classifiers. Given a sample, the metric considers the ordering of positive and negative instances, i.e., the sign of the corresponding score differences. From a model evaluation and selection point of view, it may appear unreasonable to ignore the absolute value of these differences. For this reason, several variants of the AUC metric that take score differences into account have recently been proposed. In this paper, we present a unified framework for these metrics and provide a formal analysis. We conjecture that, despite their intuitive appeal, actually none of the variants is effective, at least with regard to model evaluation and selection. An extensive empirical analysis corroborates this conjecture. Our findings also shed light on recent research dealing with the construction of AUC-optimizing classifiers.
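
As a minimal sketch of the distinction drawn in the abstract, the Python snippet below computes the AUC as the fraction of positive-negative pairs that are ordered correctly (the Wilcoxon-Mann-Whitney view), so only the sign of each score difference matters. The function `mean_score_gap` is a hypothetical, deliberately simple magnitude-aware quantity introduced here for illustration only; it is not one of the variants analysed in the paper. The score lists are made-up toy data.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def mean_score_gap(pos_scores, neg_scores):
    """Hypothetical illustration: average score difference over all pairs."""
    diffs = [p - n for p, n in product(pos_scores, neg_scores)]
    return sum(diffs) / len(diffs)


# Toy data (assumed): two scorers that induce the same ranking of six
# instances but separate positives and negatives by different margins.
confident_pos, confident_neg = [0.9, 0.8, 0.3], [0.7, 0.2, 0.1]
hesitant_pos, hesitant_neg = [0.56, 0.54, 0.48], [0.52, 0.46, 0.44]

auc_confident = pairwise_auc(confident_pos, confident_neg)
auc_hesitant = pairwise_auc(hesitant_pos, hesitant_neg)
gap_confident = mean_score_gap(confident_pos, confident_neg)
gap_hesitant = mean_score_gap(hesitant_pos, hesitant_neg)

print(f"AUC: confident={auc_confident:.3f}, hesitant={auc_hesitant:.3f}")
print(f"gap: confident={gap_confident:.3f}, hesitant={gap_hesitant:.3f}")
```

Both toy scorers rank the instances identically, so their AUC coincides, while the hypothetical gap statistic separates them. Whether such a difference is actually useful for model evaluation and selection is the question the paper examines.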

Author information

Authors and Affiliations

MICC, Department of Computer Science, Maastricht University, Maastricht, The Netherlands
Stijn Vanderlooy
Department of Mathematics and Computer Science, Marburg University, Marburg, Germany
Eyke Hüllermeier

Corresponding author

Correspondence to Eyke Hüllermeier.

Additional information

Editors: Walter Daelemans, Bart Goethals, Katharina Morik.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Vanderlooy, S., Hüllermeier, E. A critical analysis of variants of the AUC. Mach Learn 72, 247–262 (2008). https://doi.org/10.1007/s10994-008-5070-x

Received: 22 June 2008
Revised: 22 June 2008
Accepted: 23 June 2008
Published: 15 July 2008
Issue Date: September 2008
DOI: https://doi.org/10.1007/s10994-008-5070-x
