On the Noise Resilience of Ranking Measures

Berrar, Daniel

doi:10.1007/978-3-319-46672-9_6

Daniel Berrar

Conference paper
First Online: 30 September 2016

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9948)

Abstract

Performance measures play a pivotal role in the evaluation and selection of machine learning models for a wide range of applications. Using both synthetic and real-world data sets, we investigated the resilience to noise of various ranking measures. Our experiments revealed that the area under the ROC curve (AUC) and a related measure, the truncated average Kolmogorov-Smirnov statistic (taKS), can reliably discriminate between models with truly different performance under various types and levels of noise. With increasing class skew, however, the H-measure and estimators of the area under the precision-recall curve become preferable measures. Because of its simple graphical interpretation and robustness, the lower trapezoid estimator of the area under the precision-recall curve is recommended for highly imbalanced data sets.
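As a rough illustration of what a ranking measure computes and how label noise can move it, the sketch below evaluates the AUC as the fraction of positive-negative pairs that a model ranks correctly. It is a minimal sketch only, not the experimental protocol of the paper: the function name `auc`, the scores, and the flipped labels are all hypothetical and chosen for illustration.

```python
from itertools import product


def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs in which the
    positive case receives the higher score (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs and true labels (assumed for illustration).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]

# The same cases with two labels flipped to mimic class noise.
noisy_labels = [1, 1, 0, 0, 0, 1]

clean_auc = auc(scores, labels)        # ranking is perfect on clean labels
noisy_auc = auc(scores, noisy_labels)  # the same ranking scored against noisy labels

print(f"clean AUC: {clean_auc:.3f}, noisy AUC: {noisy_auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a toy example; a rank-based computation would be the usual choice for larger data sets.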

Author information

Authors and Affiliations

School of Arts and Sciences, College of Engineering, Shibaura Institute of Technology, 307 Fukasaku, Minuma-ku, Saitama, 337-8570, Japan
Daniel Berrar

Corresponding author

Correspondence to Daniel Berrar.

Editor information

Editors and Affiliations

The University of Tokyo, Tokyo, Japan
Akira Hirose
Kobe University, Kobe, Japan
Seiichi Ozawa
Okinawa Institute of Science and Technology Graduate University, Onna, Japan
Kenji Doya
Nara Institute of Science and Technology, Ikoma, Japan
Kazushi Ikeda
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Chinese Academy of Sciences, Beijing, China
Derong Liu

About this paper

Cite this paper

Berrar, D. (2016). On the Noise Resilience of Ranking Measures. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9948. Springer, Cham. https://doi.org/10.1007/978-3-319-46672-9_6

DOI: https://doi.org/10.1007/978-3-319-46672-9_6
Published: 30 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46671-2
Online ISBN: 978-3-319-46672-9
eBook Packages: Computer Science, Computer Science (R0)
