Efficient Modelling of Presence-Only Species Data via Local Background Sampling
In species distribution modelling, records of species presence are often modelled as a realization of a spatial point process whose intensity is a function of environmental covariates. One way to fit a spatial point process model is to apply logistic regression to an artificial case–control sample consisting of the observed presence records combined with a simulated pattern of background points, usually a uniform random sample from within the study’s spatial domain. In this paper we propose local background sampling as an alternative to uniform background sampling when using logistic regression to fit spatial point process models to data. Our method is similar to the local case–control sampling procedure of Fithian and Hastie (Ann Stat 42:1693–1724, 2014), but differs in that background points are sampled with probability proportional to an initial intensity estimate based on a pilot point process model. We compare local background sampling with uniform background sampling in a simulation study and in an example modelling the distributions of bumble bees (genus Bombus) in Ontario, Canada. Our results show local background sampling to be more efficient than uniform background sampling in all simulated settings and across all species analysed.
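The two sampling schemes described above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional domain [0, 1], the toy covariate x(s) = s, the coefficient values, the grid resolution, and all function names are assumptions made for illustration. The sketch also assumes the standard offset correction for logistic regression on a presence/background sample, in which the log-odds of a point being a presence equal log λ(s) − log ρ(s), where ρ(s) is the intensity of the background points. The pilot model here is simply the fit obtained from a uniform background sample.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_BETA = np.array([np.log(500.0), 2.0])  # hypothetical log-intensity coefficients
N_CELLS = 1000                              # hypothetical grid resolution on [0, 1]

def design(s):
    """Design matrix: intercept plus a toy covariate x(s) = s."""
    return np.column_stack([np.ones_like(s), s])

def simulate_presences(beta):
    """Simulate an inhomogeneous Poisson process on [0, 1] by thinning."""
    lam_max = np.exp(beta[0] + max(beta[1], 0.0))
    s = rng.uniform(size=rng.poisson(lam_max))
    keep = rng.uniform(size=s.size) < np.exp(design(s) @ beta) / lam_max
    return s[keep]

def sample_background(n_bg, weights):
    """Draw n_bg points with density proportional to the per-cell weights.

    Returns the points and rho, the background intensity in each grid cell.
    """
    p = weights / weights.sum()
    cells = rng.choice(N_CELLS, size=n_bg, p=p)
    s = (cells + rng.uniform(size=n_bg)) / N_CELLS  # jitter within the cell
    rho = n_bg * p * N_CELLS                        # points per unit length
    return s, rho

def fit_logistic(presence, background, rho):
    """Logistic regression with offset -log(rho), fitted by Newton's method."""
    s = np.concatenate([presence, background])
    y = np.concatenate([np.ones(presence.size), np.zeros(background.size)])
    X = design(s)
    cell = np.minimum((s * N_CELLS).astype(int), N_CELLS - 1)
    offset = -np.log(rho[cell])
    # Start from the intercept-only solution so the first Newton step is stable.
    beta = np.array([np.log(y.mean() / (1 - y.mean())) - offset.mean(), 0.0])
    for _ in range(50):
        p = 1.0 / (1.0 + np.exp(-(X @ beta + offset)))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

presence = simulate_presences(TRUE_BETA)
centers = (np.arange(N_CELLS) + 0.5) / N_CELLS

# Uniform background sampling: every grid cell is equally likely.
bg_u, rho_u = sample_background(1000, np.ones(N_CELLS))
beta_uniform = fit_logistic(presence, bg_u, rho_u)

# Local background sampling: cells weighted by the pilot intensity estimate.
pilot = np.exp(design(centers) @ beta_uniform)
bg_l, rho_l = sample_background(1000, pilot)
beta_local = fit_logistic(presence, bg_l, rho_l)

print("true   :", TRUE_BETA)
print("uniform:", beta_uniform)
print("local  :", beta_local)
```

A single run such as this one only shows that both schemes target the same coefficients. Comparing their efficiency, as the simulation study does, would require repeating the experiment over many simulated data sets and comparing the variability of the two estimators.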
Supplementary materials accompanying this paper appear online.
Keywords: Case–control sampling; Logistic regression; Spatial point processes; Species distribution modelling
Funding was provided by the Natural Sciences and Engineering Research Council of Canada (Discovery Grant 261497-2011-RGPIN).