On the Noise Resilience of Ranking Measures

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9948)


Performance measures play a pivotal role in the evaluation and selection of machine learning models for a wide range of applications. Using both synthetic and real-world data sets, we investigated the resilience to noise of various ranking measures. Our experiments revealed that the area under the ROC curve (AUC) and a related measure, the truncated average Kolmogorov-Smirnov statistic (taKS), can reliably discriminate between models with truly different performance under various types and levels of noise. With increasing class skew, however, the H-measure and estimators of the area under the precision-recall curve become preferable measures. Because of its simple graphical interpretation and robustness, the lower trapezoid estimator of the area under the precision-recall curve is recommended for highly imbalanced data sets.


Keywords: Ranking · Classification · Noise · Robustness · ROC curve · AUC · H-measure · taKS · Precision-recall curve
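The lower trapezoid estimator recommended in the abstract admits a short sketch. The following is a minimal illustration, not the authors' implementation: the function names (`pr_points`, `lower_trapezoid_auprc`) are hypothetical, tied scores are assumed absent, and each segment uses a pessimistic trapezoid whose left height is the minimum of the two adjacent precisions; consult Boyd et al. (reference 2) for the exact estimator and its confidence intervals.

```python
def pr_points(labels, scores):
    """Precision-recall points, one per instance, scanning scores
    in descending order. labels are 0/1; assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    tp = fp = 0
    points = []
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / n_pos, tp / (tp + fp)))  # (recall, precision)
    return points

def lower_trapezoid_auprc(labels, scores):
    """Pessimistic (lower) trapezoidal estimate of the area under
    the precision-recall curve."""
    pts = pr_points(labels, scores)
    r_prev, p_prev = pts[0]
    # Convention: start the curve at recall 0 with the first precision.
    area = r_prev * p_prev
    for r, p in pts[1:]:
        # Lower trapezoid: left height is the smaller adjacent precision.
        area += (r - r_prev) * (min(p_prev, p) + p) / 2.0
        r_prev, p_prev = r, p
    return area
```

For a perfectly ranked sample (all positives scored above all negatives) the estimate is 1; inverting the ranking drives it toward the class prior, which is why the measure remains informative under heavy class skew.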



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. School of Arts and Sciences, College of Engineering, Shibaura Institute of Technology, Minuma-ku, Japan
