Abstract
Obtaining diverse samples of invasive alien species (both presence and absence samples) is vital for building species distribution models. However, because collection efforts focus on presence samples, most datasets on invasive species lack explicit absence samples. Generating effective pseudo-absence samples is therefore a critical issue when building species distribution models for invasive species. This paper proposes a pseudo-absence sampling approach based on outlier detection in the geographical characteristic space. First, principal component analysis is used to model the linear correlation among the original variables, and a statistical index is built to determine the weight of each principal component. Next, in the geographical characteristic space built from the principal components and their corresponding weights, the local outlier factor is computed to identify the pseudo-absence samples. A dataset on the invasive species Erigeron annuus in the Yangtze River Economic Belt is used to illustrate the general process of the proposed approach. The prediction results from logistic regression with the proposed approach are better than those obtained with spatial random sampling, the surface range envelope, and the one-class support vector machine. These findings validate the effectiveness of the proposed sampling approach.
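The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the statistical index used to weight the principal components, so the sketch assumes the normalised explained-variance ratio as the weight. The function names `weighted_pc_space` and `lof_scores`, the parameters `var_kept` and `k`, the threshold of 1.5, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def weighted_pc_space(presence, candidates, var_kept=0.9):
    """Project both sets onto principal components fitted on the presence
    samples, then scale each component by its weight."""
    mu = presence.mean(axis=0)
    sd = presence.std(axis=0, ddof=1)
    zp = (presence - mu) / sd
    zc = (candidates - mu) / sd
    # Eigen-decomposition of the correlation matrix of the presence samples.
    eigval, eigvec = np.linalg.eigh(np.corrcoef(zp, rowvar=False))
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    ratio = eigval / eigval.sum()
    # Keep the leading components that reach the requested variance share.
    m = min(int(np.searchsorted(np.cumsum(ratio), var_kept)) + 1, len(ratio))
    # ASSUMPTION: weight = normalised explained-variance ratio. The paper's
    # own statistical index is not given in the abstract.
    w = ratio[:m] / ratio[:m].sum()
    return (zp @ eigvec[:, :m]) * w, (zc @ eigvec[:, :m]) * w

def lof_scores(reference, query, k=10, eps=1e-12):
    """Local outlier factor of each query point relative to the reference set."""
    d_rr = np.linalg.norm(reference[:, None, :] - reference[None, :, :], axis=2)
    np.fill_diagonal(d_rr, np.inf)  # a point is not its own neighbour
    nn_rr = np.argsort(d_rr, axis=1)[:, :k]
    d_rr_k = np.take_along_axis(d_rr, nn_rr, axis=1)
    kdist = d_rr_k[:, -1]  # k-distance of every reference point
    # Reachability distance: max(k-distance of neighbour, actual distance).
    lrd_r = 1.0 / (np.maximum(d_rr_k, kdist[nn_rr]).mean(axis=1) + eps)

    d_qr = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=2)
    nn_qr = np.argsort(d_qr, axis=1)[:, :k]
    d_qr_k = np.take_along_axis(d_qr, nn_qr, axis=1)
    lrd_q = 1.0 / (np.maximum(d_qr_k, kdist[nn_qr]).mean(axis=1) + eps)
    # LOF: mean local density of the neighbours divided by the point's own.
    return lrd_r[nn_qr].mean(axis=1) / lrd_q

# Synthetic stand-in data (hypothetical): 4 environmental variables.
rng = np.random.default_rng(0)
presence = rng.normal(0.0, 1.0, size=(200, 4))
candidates = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 4)),  # similar to presence conditions
    rng.normal(6.0, 1.0, size=(50, 4)),  # dissimilar conditions
])

p_pc, c_pc = weighted_pc_space(presence, candidates)
scores = lof_scores(p_pc, c_pc, k=10)
# Hypothetical threshold: candidates with a high LOF become pseudo-absences.
pseudo_absence = candidates[scores > 1.5]
```

Scoring the candidate locations against the presence samples, rather than within a pooled set, is a design choice of this sketch: a candidate is treated as a likely absence when its local density in the weighted component space is much lower than that of its nearest presence samples.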
Acknowledgements
This study was jointly supported by the National Science Foundation of China (Nos. 41801311 and 41871320), the Philosophy and Social Science Foundation of Hunan Province, China (No. 18YBQ050), and the Scientific Research Fund of Hunan Provincial Education Department (No. 19C0777).
Cite this article
Yang, W., He, H., Wei, D. et al. Generating pseudo-absence samples of invasive species based on outlier detection in the geographical characteristic space. J Geogr Syst 24, 261–279 (2022). https://doi.org/10.1007/s10109-021-00362-6
DOI: https://doi.org/10.1007/s10109-021-00362-6
Keywords
- Invasive species
- Spatial prediction
- Spatial sampling
- Principal component analysis
- Local outlier detection