Abstract
Species distribution modelling (SDM) is a useful tool for predicting the distribution of freshwater species from selected habitat variables. Model performance (goodness-of-fit), expressed as the coefficient of determination, is a straightforward measure of SDM quality but does not sufficiently address predictive success. Predictive performance (e.g. AUC), in contrast, accounts for the correctness of predicted species occurrences. In this study, we compared the model and predictive performance of SDMs for eleven macroinvertebrate taxa in a mountain catchment, with emphasis on species prevalence. SDMs were based on two regression methods using broad-scale environmental predictors (land use, instream habitat quality). We applied a cross-validation and a field validation approach, the latter using newly sampled field data. In contrast to the other species, SDMs showed acceptable performance (pseudo-R² > 0.3) for the stonefly Dinocras cephalotes and the caddisflies Silo piceus and Silo pallipes. Model performance was neither positively nor linearly correlated with predictive accuracy. The comparison of cross- and field validation revealed that cross-validation overestimated the discriminatory power of the models. SDMs of less prevalent species tend to over-predict absences rather than presences. Consequently, model performance is decoupled from predictive performance. The validation results suggest that new field data provide a more reliable benchmark for SDM assessment.
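The distinction between goodness-of-fit and discriminatory power can be illustrated with a minimal sketch. It assumes a Nagelkerke-type pseudo-R² (the abstract does not state which pseudo-R² variant was used) and computes AUC as the proportion of presence-absence pairs that are ranked correctly. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def auc(y_true, scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def nagelkerke_r2(y_true, probs):
    """Nagelkerke pseudo-R2 from predicted occurrence probabilities (assumed variant)."""
    n = len(y_true)
    prevalence = sum(y_true) / n
    # Log-likelihood of the fitted model and of the intercept-only (prevalence) model.
    ll_model = sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(y_true, probs))
    ll_null = sum(math.log(prevalence if y == 1 else 1.0 - prevalence) for y in y_true)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    upper_bound = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / upper_bound


# Hypothetical toy data: two presences, two absences.
observed = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.2, 0.1]     # well-separated probabilities
cautious = [0.6, 0.55, 0.45, 0.4]    # same ranking, weaker fit

print(auc(observed, confident), nagelkerke_r2(observed, confident))
print(auc(observed, cautious), nagelkerke_r2(observed, cautious))
```

Both toy models rank the sites identically and therefore share the same AUC, while their pseudo-R² values differ. This is one simple way in which the two kinds of measure can diverge; it is an illustration only and does not reproduce the study's analysis.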
Acknowledgement
This work was supported financially and non-materially by the Deutsche Bundesstiftung Umwelt (DBU) as well as by the German Research Foundation (DFG, grant no. HE 2764/2-1). We are grateful to the North Rhine-Westphalia State Agency for Nature, Environment and Consumer Protection (LANUV) for providing physical habitat quality survey data and the digital river network (3A). We also thank the Cologne district government in North Rhine-Westphalia (Geobasis NRW) for providing the digital terrain model (DGM5).
Additional information
Handling editor: Sonja Stendera
Cite this article
Gies, M., Sondermann, M., Hering, D. et al. A comparison of modelled and actual distributions of eleven benthic macroinvertebrate species in a Central European mountain catchment. Hydrobiologia 758, 123–140 (2015). https://doi.org/10.1007/s10750-015-2280-7
DOI: https://doi.org/10.1007/s10750-015-2280-7