Threshold-dependence as a desirable attribute for discrimination assessment: implications for the evaluation of species distribution models

Jiménez-Valverde, Alberto

doi:10.1007/s10531-013-0606-1

Original Paper
Published: 07 January 2014

Volume 23, pages 369–385, (2014)
Biodiversity and Conservation

Alberto Jiménez-Valverde¹

Abstract

Species distribution modelling has become a common approach in ecology in recent decades. As in any modelling exercise, evaluation of the predicted suitability surfaces is a key step, and the area under the receiver operating characteristic (ROC) curve (AUC) has become the most popular statistic for this purpose. A close covariation between the AUC and threshold-dependent discrimination measures (sensitivity, Se, and specificity, Sp) calls into question the advantage of the threshold-independence of the AUC. In this study, the relationship between the AUC and several threshold-dependent discrimination measures is characterized in detail, and the sensitivity of the pattern to variations in the shape of the ROC curve is assessed. Hypothetical suitability values, drawn from normal and skew-normal distributions, were simulated for both presences and absences. The flexibility of the skew-normal distribution allowed a wide range of ROC curve configurations to be simulated. The relationship between the AUC and the threshold-dependent measures was assessed graphically; independently of the ROC curve shape, a nonlinear asymptotic relationship between the AUC and Se (and Sp) was obtained after applying the threshold that makes Se = Sp. A nonlinear asymptotic relationship between the AUC and the Youden index was also found. These results imply that the AUC does not appropriately measure changes in the discrimination of models, and that it is especially incapable of distinguishing between models with high discrimination capacity. Se or Sp obtained by applying the threshold that makes them equal is a preferable measure of discrimination power. Together with the rates of false positives and false negatives, and with the prevalence of the species, these statistics provide more information about the discrimination capacity of the models than the AUC.
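The kind of simulation summarised in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the author's code: the study cites R and the "sn" package, whereas this sketch uses only the Python standard library, and the function names (`skew_normal_sample`, `auc`, `threshold_measures`, `demo`), the sample size, the seed and the separations `d` are all assumptions chosen for illustration. Skew-normal values are drawn with the standard representation delta·|Z0| + sqrt(1 − delta²)·Z1, and the AUC is estimated as the Mann-Whitney probability that a presence scores higher than an absence.

```python
import math
import random
from bisect import bisect_left, bisect_right
from statistics import NormalDist


def skew_normal_sample(n, location, scale, shape, rng):
    """Draw n skew-normal values; shape = 0 reduces to a plain normal."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    root = math.sqrt(1.0 - delta * delta)
    return [
        location + scale * (delta * abs(rng.gauss(0.0, 1.0)) + root * rng.gauss(0.0, 1.0))
        for _ in range(n)
    ]


def auc(presences, absences):
    """Mann-Whitney AUC estimate: P(presence > absence), ties counted as 1/2."""
    a = sorted(absences)
    wins = 0.0
    for p in presences:
        below = bisect_left(a, p)          # absences scored strictly lower
        ties = bisect_right(a, p) - below  # absences with the same score
        wins += below + 0.5 * ties
    return wins / (len(presences) * len(a))


def threshold_measures(presences, absences):
    """Return (Se at the threshold where Se is closest to Sp, maximum Youden index)."""
    p, a = sorted(presences), sorted(absences)
    best_gap, se_eq, youden = float("inf"), None, -1.0
    for t in sorted(set(p + a)):
        se = 1.0 - bisect_left(p, t) / len(p)  # presences scored >= t
        sp = bisect_left(a, t) / len(a)        # absences scored < t
        if abs(se - sp) < best_gap:
            best_gap, se_eq = abs(se - sp), (se + sp) / 2.0
        youden = max(youden, se + sp - 1.0)
    return se_eq, youden


def demo(n=2000, seed=1):
    """Compare simulated values with the equal-variance normal closed forms."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    for d in (1.0, 2.0, 3.0, 4.0):  # assumed separations between the two classes
        pres = skew_normal_sample(n, d, 1.0, 0.0, rng)
        absn = skew_normal_sample(n, 0.0, 1.0, 0.0, rng)
        se_eq, j = threshold_measures(pres, absn)
        print(f"d={d}: AUC={auc(pres, absn):.3f} (theory {phi(d / math.sqrt(2)):.3f}), "
              f"Se=Sp={se_eq:.3f} (theory {phi(d / 2):.3f}), Youden={j:.3f}")


if __name__ == "__main__":
    demo()
```

For two unit-variance normal distributions separated by d, the closed forms are AUC = Φ(d/√2) and Se = Sp = Φ(d/2). Moving from d = 3 to d = 4 therefore raises the AUC only from about 0.983 to 0.998, while Se = Sp still moves from about 0.933 to 0.977, which is consistent with the saturation of the AUC at high discrimination described above. Passing a non-zero `shape` to `skew_normal_sample` gives asymmetric ROC curves, for which the closed forms printed by `demo` no longer apply.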

Acknowledgments

Comments made by Jorge M. Lobo and two anonymous referees helped to improve the manuscript. Lucía Maltez kindly reviewed the English. A. J.-V. was supported by the CSIC JAE-Doc Program which is partially financed by the European Social Fund.

Author information

Authors and Affiliations

Dpto. Biogeografía y Cambio Global, Museo Nacional de Ciencias Naturales de Madrid, 28006, Madrid, Spain
Alberto Jiménez-Valverde

Corresponding author

Correspondence to Alberto Jiménez-Valverde.

Appendix

See Fig. 8.

About this article

Cite this article

Jiménez-Valverde, A. Threshold-dependence as a desirable attribute for discrimination assessment: implications for the evaluation of species distribution models. Biodivers Conserv 23, 369–385 (2014). https://doi.org/10.1007/s10531-013-0606-1

Received: 04 September 2013
Accepted: 20 December 2013
Published: 07 January 2014
Issue Date: February 2014
DOI: https://doi.org/10.1007/s10531-013-0606-1
