Abstract
Species distribution modelling has become a common approach in ecology in recent decades. As in any modelling exercise, evaluation of the predicted suitability surfaces is a key step, and the area under the receiver operating characteristic (ROC) curve (AUC) has become the most popular statistic for this purpose. A close covariation between the AUC and threshold-dependent discrimination measures (sensitivity, Se, and specificity, Sp) calls into question the advantage of the threshold-independence of the AUC. In this study, the relationship between the AUC and several threshold-dependent discrimination measures is characterized in detail, and the sensitivity of the pattern to variations in the shape of the ROC curve is assessed. Hypothetical suitability values, drawn from normal and skew-normal distributions, were simulated for both presences and absences. The flexibility of the skew-normal distribution allowed for the simulation of a wide range of ROC curve configurations. The relationship between the AUC and threshold-dependent measures was assessed graphically; independently of the ROC curve shape, a nonlinear asymptotic relationship between the AUC and Se (and Sp) was obtained after applying the threshold that makes Se = Sp. A nonlinear asymptotic relationship between the AUC and the Youden index was also found. These results imply that the AUC does not appropriately measure changes in the discrimination of models, and that it is especially incapable of distinguishing between models with high discrimination capacity. Se or Sp derived from the application of the threshold that makes them equal is a preferable measure of discrimination power. Together with the rates of false positives and false negatives, and with the prevalence of the species, these statistics provide more information about the discrimination capacity of the models than the AUC.
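The relationship described above can be illustrated with a minimal sketch. It is not the simulation used in the study: it assumes the simplest case only, in which absence suitability values follow N(0, 1) and presence values follow N(d, 1), and it uses the closed-form expressions for that equal-variance normal case instead of simulated draws. The function name `binormal_summary` and the separation parameter `d` are hypothetical names introduced for this illustration; the skew-normal configurations are not covered.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used only for its CDF


def binormal_summary(d):
    """Discrimination measures under an assumed equal-variance normal model.

    Assumption (illustrative, not taken from the study): absences ~ N(0, 1)
    and presences ~ N(d, 1), where d is the separation between the means.
    """
    # AUC = P(presence value > absence value); the difference is N(d, 2).
    auc = STD_NORMAL.cdf(d / 2 ** 0.5)
    # Se(t) = Phi(d - t) and Sp(t) = Phi(t), so Se = Sp at t = d / 2.
    se_eq_sp = STD_NORMAL.cdf(d / 2)
    # Youden index J = Se + Sp - 1, evaluated at that same threshold.
    youden = 2 * se_eq_sp - 1
    return auc, se_eq_sp, youden


# Print the three measures for increasing separation between the classes.
for d in (0.0, 1.0, 2.0, 3.0, 4.0):
    auc, se, j = binormal_summary(d)
    print(f"d={d:.1f}  AUC={auc:.3f}  Se=Sp={se:.3f}  J={j:.3f}")
```

Under these assumptions the AUC is never smaller than Se = Sp for d ≥ 0 and approaches 1 sooner as d grows, so at high discrimination a small change in the AUC corresponds to a larger change in Se and Sp.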
Acknowledgments
Comments made by Jorge M. Lobo and two anonymous referees helped to improve the manuscript. Lucía Maltez kindly reviewed the English. A. J.-V. was supported by the CSIC JAE-Doc Program, which is partially financed by the European Social Fund.
Appendix
See Fig. 8.
Cite this article
Jiménez-Valverde, A. Threshold-dependence as a desirable attribute for discrimination assessment: implications for the evaluation of species distribution models. Biodivers Conserv 23, 369–385 (2014). https://doi.org/10.1007/s10531-013-0606-1
DOI: https://doi.org/10.1007/s10531-013-0606-1