Knowledge and Information Systems, Volume 35, Issue 1, pp 131–152

On measuring the performance of binary classifiers

Regular Paper

DOI: 10.1007/s10115-012-0558-x

Cite this article as:
Parker, C. Knowl Inf Syst (2013) 35: 131. doi:10.1007/s10115-012-0558-x


Given two binary classifiers and a set of test data, it should be straightforward to determine which of the two is superior. Recent work, however, has called into question many of the methods heretofore accepted as standard for this task. In this paper, we analyze seven ways of determining whether one classifier is better than another, given the same test data. Five of these are long established, and two are relative newcomers. We review and extend work showing that one of these methods is clearly inappropriate and then conduct an empirical analysis with a large number of datasets to evaluate the real-world implications of our theoretical analysis. Both our empirical and theoretical results converge strongly toward one of the newer methods.


Keywords

Performance measures, Binary classification, Supervised learning, Evaluation

Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  1. BigML, Inc., Corvallis, USA
