Data Mining and Knowledge Discovery, Volume 30, Issue 4, pp 848–890

Binarised regression tasks: methods and evaluation metrics

  • José Hernández-Orallo
  • Cèsar Ferri
  • Nicolas Lachiche
  • Adolfo Martínez-Usó
  • M. José Ramírez-Quintana


Some supervised tasks present a numerical output, but decisions have to be made in a discrete, binarised way, according to a particular cutoff. This binarised regression task is a very common situation that requires its own analysis, different from regression, classification, and ordinal regression. We first investigate the application cases in terms of the information available about the distribution and range of the cutoffs, and distinguish six possible scenarios, some of which are more common than others. Next, we study two basic approaches: the retraining approach, which discretises the training set whenever the cutoff is available and learns a new classifier from it, and the reframing approach, which learns a regression model and sets the cutoff when it becomes available during deployment. In order to assess the binarised regression task, we introduce context plots featuring error against cutoff. Two special cases are of particular interest, the UCE and OCE curves: the area under the former is the mean absolute error, while the latter yields a new metric that lies between a ranking measure and a residual-based measure. A comprehensive evaluation of the retraining and reframing approaches is performed using a repository of binarised regression problems created for this purpose, concluding that neither method is clearly better than the other, except when the size of the training data is small.
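The two approaches contrasted above, and the claim that the area under the UCE curve equals the mean absolute error, can be illustrated with a minimal sketch on synthetic data. Everything below (the toy data, the least-squares regressor, and the single-threshold stand-in classifier) is an illustrative assumption, not the paper's actual models, metrics code, or problem repository:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data: y is a noisy linear function of x, and decisions
# binarise y at a cutoff known only at deployment time.
x = rng.uniform(0, 10, 200)
y = 2.0 * x + rng.normal(0, 2.0, 200)
cutoff = 10.0
labels = y >= cutoff  # the "true" binarised decisions at this cutoff

# Reframing: learn ONE regression model (here plain least squares) and,
# once the cutoff is known, binarise its numerical predictions.
A = np.column_stack([x, np.ones_like(x)])
w, b = np.linalg.lstsq(A, y, rcond=None)[0]
y_hat = w * x + b
reframed = y_hat >= cutoff

# Retraining: once the cutoff is known, discretise the training labels and
# learn a classifier from scratch (here a single threshold on x chosen to
# minimise training error, standing in for any off-the-shelf classifier).
thresholds = np.sort(x)
train_err = [np.mean((x >= t) != labels) for t in thresholds]
retrained = x >= thresholds[int(np.argmin(train_err))]

print("reframing  accuracy:", np.mean(reframed == labels))
print("retraining accuracy:", np.mean(retrained == labels))

# Sanity check of the UCE-area claim: per example, integrating the 0/1
# disagreement 1[(y_hat >= c) != (y >= c)] over all cutoffs c recovers
# the absolute error |y_hat - y|, so the mean area equals the MAE.
cs = np.linspace(min(y.min(), y_hat.min()) - 1,
                 max(y.max(), y_hat.max()) + 1, 20001)
dc = cs[1] - cs[0]
areas = [np.sum((yh >= cs) != (yt >= cs)) * dc for yh, yt in zip(y_hat, y)]
print("mean area:", np.mean(areas), " MAE:", np.mean(np.abs(y_hat - y)))
```

The reframing branch fits its model once and can serve any cutoff, whereas the retraining branch must be rerun for every new cutoff; this deployment-time difference is what the six scenarios in the paper distinguish.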


Keywords: Regression · Classification · Reframing · Mean absolute error · Cutoff · Binarisation



We thank the anonymous reviewers for their comments, which have helped to improve this paper significantly. We thank Peter Flach and Meelis Kull for their insightful comments and very useful suggestions. This work was supported by the Spanish MINECO under Grant TIN 2013-45732-C4-1-P and by Generalitat Valenciana PROMETEOII2015/013. This research has been developed within the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Ministerio de Economía y Competitividad in Spain (PCIN-2013-037) and the Agence Nationale pour la Recherche in France (ANR-12-CHRI-0005-03).



Copyright information

© The Author(s) 2015

Authors and Affiliations

  • José Hernández-Orallo (1)
  • Cèsar Ferri (1)
  • Nicolas Lachiche (2)
  • Adolfo Martínez-Usó (1)
  • M. José Ramírez-Quintana (1)
  1. Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Valencia, Spain
  2. ICube, Université de Strasbourg, CNRS, Illkirch Cedex, France
