Abstract
Conventional machine learning methods are widely used to build spatial prediction models because they can adaptively learn the mapping relationships among spatial data with limited prior knowledge. However, applying these methods directly to build a single global model ignores spatial heterogeneity, so the model cannot accurately describe the local relationships among spatial variables and may produce inaccurate predictions. To address this shortcoming, we present a unified framework for handling spatial heterogeneity that incorporates the geographically weighted scheme into machine learning methods. The proposed framework has the potential to extend existing machine learning models to the analysis of heterogeneous spatial data. As an implementation of the framework, we introduce geographically weighted support vector regression (GWSVR). Experiments on two environmental datasets were used to test the model's predictive ability. The results show that the mean absolute percentage error, normalized mean square error, and relative error percentage of the GWSVR model are 0.436, 0.903, and 0.558, respectively, when analysing soil chromium (Cr) concentrations and 0.221, 0.287, and 0.206, respectively, when predicting PM2.5 concentrations; these values are lower than those obtained using support vector regression, geographically weighted regression (GWR), and GWR-kriging models. These case studies demonstrate the validity and feasibility of the proposed framework.
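The idea of combining a geographically weighted scheme with a learner can be illustrated with a minimal sketch. It is not the authors' GWSVR implementation: the Gaussian distance-decay kernel, the fixed bandwidth, the toy data, and the function names (`gaussian_weights`, `fit_weighted_line`, `gw_predict`) are all assumptions made for illustration, and a weighted straight-line fit stands in for support vector regression so that the example needs only the standard library. Any learner that accepts per-sample weights could be passed in its place.

```python
import math


def gaussian_weights(coords, target, bandwidth):
    """Gaussian distance-decay weight of each sample relative to the target location
    (an assumed kernel; other kernels and adaptive bandwidths are possible)."""
    weights = []
    for x, y in coords:
        d = math.hypot(x - target[0], y - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights


def fit_weighted_line(xs, ys, weights):
    """Weighted least squares for y = a + b*x.
    Hypothetical stand-in for any learner that accepts sample weights."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def gw_predict(coords, xs, ys, target_coord, target_x, bandwidth,
               fit=fit_weighted_line):
    """Fit a local model weighted by proximity to target_coord, then predict there."""
    weights = gaussian_weights(coords, target_coord, bandwidth)
    model = fit(xs, ys, weights)
    return model(target_x)


# Toy heterogeneous data: y = 2x in the west, y = -x in the east.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
xs = [1, 2, 3, 1, 2, 3]
ys = [2, 4, 6, -1, -2, -3]

# A single global model (uniform weights) blends the two regimes.
global_pred = fit_weighted_line(xs, ys, [1.0] * len(xs))(2)

# Local models recover the relationship that holds near each target location.
west_pred = gw_predict(coords, xs, ys, (1, 0), 2, bandwidth=1.0)
east_pred = gw_predict(coords, xs, ys, (11, 0), 2, bandwidth=1.0)

print(global_pred, west_pred, east_pred)
```

In this toy setting the global fit averages two opposite local relationships, while the geographically weighted fits follow whichever relationship holds near the prediction location. That contrast is the motivation the abstract gives for modelling spatial heterogeneity explicitly.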
Acknowledgements
This study was jointly supported by the National Science Foundation of China (No. 41801311 and No. 41901406), the Natural Science Foundation of Hunan, China (No. 2021JJ30245), the Philosophy and Social Science Foundation of Hunan Province, China (No. 18YBQ050), and the Scientific Research Fund of Hunan Provincial Education Department (No. 19C0777).
Cite this article
Yang, W., Deng, M., Tang, J. et al. Geographically weighted regression with the integration of machine learning for spatial prediction. J Geogr Syst (2022). https://doi.org/10.1007/s10109-022-00387-5
DOI: https://doi.org/10.1007/s10109-022-00387-5
Keywords
- Spatial data prediction
- Spatial heterogeneity
- Support vector regression
- Environmental pollution
JEL Classification
- C13
- C31
- Q53