Spatial regression graph convolutional neural networks: A deep learning paradigm for spatial multivariate distributions

A Correction to this article was published on 03 March 2022

This article has been updated

Abstract

Geospatial artificial intelligence (GeoAI) has emerged as a subfield of GIScience that uses artificial intelligence approaches and machine learning techniques for geographic knowledge discovery. The irregularity of many data structures has recently led to different variants of graph neural networks in computer science. Graph convolutional neural networks are among the most prominent; they operate on non-Euclidean structured data in which the number of connections per node varies and the nodes are unordered. These networks use graph convolutions (commonly known as filters or kernels) in place of general matrix multiplication in at least one of their layers. This paper proposes spatial regression graph convolutional neural networks (SRGCNNs) as a deep learning paradigm capable of handling a wide range of geographical tasks in which multivariate spatial data needs modeling and prediction. The feasibility of SRGCNNs lies in their feature propagation mechanisms, their spatially local nature, and a semi-supervised training strategy. In the experiments, this paper demonstrates the operation of SRGCNNs with social media check-in data in Beijing and house price data in San Diego. The results indicate that a well-trained SRGCNN model is capable of learning from samples and making reasonable predictions for unobserved locations. The paper also shows the effectiveness of incorporating the idea of geographically weighted regression into the model to handle heterogeneity between locations. Compared to conventional spatial regression approaches, SRGCNN-based models tend to generate much more accurate and stable results, especially when the sampling ratio is low. This study helps to bridge the methodological gap between graph deep learning and spatial regression analytics. The proposed idea serves as an example of how spatial analytics can be combined with state-of-the-art deep learning models, and is intended to inform future research at the frontier of GeoAI.

Notes

  1. https://pytorch.org

  2. http://pysal.org/packages

  3. http://insideairbnb.com/about.html

References

  1. Paelinck JHP, Klaassen LH, Ancot J-P, Verster ACP (1979) Spatial econometrics, vol 1. Saxon House

  2. Anselin L (1988) Spatial econometrics: Methods and models, vol 4. Springer Science & Business Media, Berlin

  3. LeSage JP (1997) Regression analysis of spatial data. J Region Anal Policy 27:83–94

  4. LeSage JP, Fischer MM (2008) Spatial growth regressions: model specification, estimation and interpretation. Spatial Economic Analysis 3:275–304

  5. Lehmann A, Overton JMcC, Leathwick JR (2002) Grasp: generalized regression analysis and spatial prediction. Ecol Modell 157:189–207

  6. Anselin L (2010) Thirty years of spatial econometrics. Papers Region Sci 89(1):3–25

  7. Fischer MM, Wang J (2011) Spatial data analysis: models, methods and techniques. Springer Science & Business Media

  8. Liu Y, Liu X, Gao S, Gong L, Kang C, Zhi Y, Chi G, Shi L (2015) Social sensing: A new approach to understanding our socioeconomic environments. Ann Assoc Amer Geograph 105:512–530

  9. Vatsavai R, Chandola V (2016) Guest editorial: big spatial data. GeoInformatica 20:797–799

  10. Cheng T, Adepeju M (2014) Modifiable temporal unit problem (mtup) and its effect on space-time cluster detection. PloS one 9:e100465

  11. Haworth J, Cheng T (2012) Non-parametric regression for space–time forecasting under missing data. Comput Environ Urban Syst 36:538–550

  12. Kelejian HH, Prucha IR (2007) The relative efficiencies of various predictors in spatial econometric models containing spatial lags. Region Sci Urban Econ 37:363–374

  13. Fischer MM (1998) Computational neural networks: a new paradigm for spatial analysis. Environ Plann A 30:1873–1891

  14. Gu Y, Wylie BK, Boyte SP, Picotte J, Howard DM, Smith K, Nelson KJ (2016) An optimal sample data usage strategy to minimize overfitting and underfitting effects in regression tree models based on remotely-sensed data. Remote Sens 8:943

  15. Mennis J, Guo D (2009) Spatial data mining and geographic knowledge discovery–an introduction. Comput Environ Urban Syst 33:403–408

  16. Reichstein M, Camps-Valls G, Stevens B, Jung M, Denzler J, Carvalhais N, et al. (2019) Deep learning and process understanding for data-driven earth system science. Nature 566:195–204

  17. Janowicz K, Gao S, McKenzie G, Hu Y, Bhaduri B (2020) GeoAI: spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. Int J Geograph Inf Sci 34:625–636

  18. Li W, Hsu C-Y (2020) Automated terrain feature identification from remote sensing imagery: a deep learning approach. Int J Geograph Inf Sci 34 (4):637–660

  19. Yan X, Ai T, Yang M, Yin H (2019) A graph convolutional neural network for classification of building patterns using spatial vector data. ISPRS J Photogramm Remote Sens 150:259–273

  20. Zhang F, Wu L, Zhu D, Liu Y (2019) Social sensing from street-level imagery: A case study in learning spatio-temporal urban mobility patterns. ISPRS J Photogramm Remote Sens 153:48–58

  21. Liu P, De Sabbata S (2021) A graph-based semi-supervised approach to classification learning in digital geographies. Comput Environ Urban Syst 86:101583

  22. Zhu D, Cheng X, Zhang F, Yao X, Gao Y, Liu Y (2020) Spatial interpolation using conditional generative adversarial neural networks. Int J Geograph Inf Sci 34:735–758

  23. Xing X, Huang Z, Cheng X, Zhu D, Kang C, Zhang F, Liu Y (2020) Mapping human activity volumes through remote sensing imagery. IEEE J Sel Top Appl Earth Observ Remote Sens 13:5652–5668

  24. Du Z, Wang Z, Wu S, Zhang F, Liu R (2020) Geographically neural network weighted regression for the accurate estimation of spatial non-stationarity. Int J Geograph Inf Sci 34:1353–1377

  25. Zhu D, Zhang F, Wang S, Wang Y, Cheng X, Huang Z, Liu Y (2020) Understanding place characteristics in geographic contexts through graph convolutional neural networks. Ann Amer Assoc Geograph 110:408–420

  26. Xiao L, Lo S, Zhou J, Liu J, Yang L (2020) Predicting vibrancy of metro station areas considering spatial relationships through graph convolutional neural networks: The case of shenzhen, china. Environ Plann B: Urban Anal City Sci:2399808320977866

  27. Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural networks 61:85–117

  28. Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond euclidean data. IEEE Signal Process Mag 34:18–42

  29. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444

  30. Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, pp 3844–3852

  31. Niepert M, Ahmed M, Kutzkov K (2016) Learning convolutional neural networks for graphs. In: International conference on machine learning, pp 2014–2023

  32. Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Advances in neural information processing systems, pp 1024–1034

  33. Chung FRK (1997) Spectral graph theory. American Mathematical Society

  34. Hammond DK, Vandergheynst P, Gribonval R (2009) Wavelets on graphs via spectral graph theory. Appl Comput Harmon Anal 30:129–150

  35. Bruna J, Zaremba W, Szlam A, Lecun Y (2014) Spectral networks and locally connected networks on graphs. In: International Conference on Learning Representations

  36. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations

  37. Chen R, Wang X, Zhang W, Zhu X, Li A, Yang C (2019) A hybrid CNN-LSTM model for typhoon formation forecasting. GeoInformatica 23:375–396

  38. Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, Deng M, Li H (2019) T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems 21(9):3848–3858

  39. Zhang Y, Cheng T, Ren Y, Xie K (2020) A novel residual graph convolution deep learning model for short-term network-based traffic forecasting. Int J Geograph Inf Sci 34:969–995

  40. Bai J, Zhu J, Song Y, Zhao L, Hou Z, Du R, Li H (2021) A3t-gcn: Attention temporal graph convolutional network for traffic forecasting. ISPRS Int J Geo-Inf 10:485

  41. Bui K-HN, Cho J, Yi H (2021) Spatial-temporal graph neural network for traffic forecasting: An overview and open research issues. Appl Intell:1–12

  42. Hu S, Gao S, Wu L, Xu Y, Zhang Z, Cui H, Gong X (2021) Urban function classification at road segment level using taxi trajectory data: A graph convolutional neural network approach. Comput Environ Urban Syst 87:101619

  43. Griffith DA, Paelinck JHP (2011) Non-standard spatial statistics and spatial econometrics. Springer Science & Business Media

  44. Arbia G (2014) A primer for spatial econometrics with applications in r. Springer

  45. Anselin L, Rey SJ (2014) Modern spatial econometrics in practice: A guide to geoda, geodaspace and pysal. GeoDa Press LLC

  46. Kelejian H, Piras G (2017) Spatial econometrics. Academic Press

  47. Yamagata Y, Seya H (2019) Spatial analysis using big data: Methods and urban applications. Academic Press

  48. LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press/Taylor & Francis, Boca Raton

  49. Fotheringham AS, Yang W, Kang W (2017) Multiscale Geographically Weighted Regression (MGWR). Ann Amer Assoc Geograph 107:1247–1265

  50. Ord K (1975) Estimation methods for models of spatial interaction. J Amer Stat Assoc 70:120–126

  51. Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121

  52. Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40(2):509–533

  53. Goodfellow I, Bengio Y, Courville A (2016) Deep learning, vol 1. MIT Press, Cambridge

  54. Paola JD, Schowengerdt RA (1995) A detailed comparison of backpropagation neural network and maximum-likelihood classifiers for urban land use classification. IEEE Trans Geosci Remote Sens 33(4):981–996

  55. Vatsavai RR, Bhaduri B (2011) A hybrid classification scheme for mining multisource geospatial data. GeoInformatica 15:29–47

  56. Zhu D, Wang N, Wu L, Liu Y (2017) Street as a big geo-data assembly and analysis unit in urban studies: A case study using beijing taxi data. Appl Geograph 86:152–164

  57. Long Y, Liu X (2013) How mixed is beijing, china? a visual exploration of mixed land use. Environ Plann A 45:2797–2798

  58. Chen L, Gao Y, Zhu D, Yuan Y, Liu Y (2019) Quantifying the scale effect in geospatial big data using semi-variograms. PloS one 14:e0225139

  59. Anselin L (2009) Spatial regression. In: Fotheringham AS, Rogerson PA (eds) The SAGE Handbook of Spatial Analysis. SAGE Publications, Los Angeles, pp 255–275

  60. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv:1710.10903

  61. Wu S, Wang Z, Du Z, Huang B, Zhang F, Liu R (2020) Geographically and temporally neural network weighted regression for modeling spatiotemporal non-stationary relationships. Int J Geograph Inf Sci:1–27

  62. Kwan M-P (2012) The uncertain geographic context problem. Ann Assoc Amer Geograph 102:958–968

Acknowledgements

The authors gratefully acknowledge editors, the anonymous reviewers, Dr. Tao Cheng, Dr. Yang Zhang, Dr. Ximeng Cheng, and Dr. Fan Zhang for their helpful comments. This work was partially supported by the New Faculty Set-up Funding of College of Liberal Arts, University of Minnesota (1000-10964-20042-5672018). Prof. Yu Liu is supported by the National Key Research and Development Program of China (2017YFB0503602) and the National Natural Science Foundation of China (41625003).

Author information

Corresponding author

Correspondence to Di Zhu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised due to incorrect wording in the Introduction.

Appendices

Appendix

In this section, we discuss SRGCNNs in a more typical spatial regression scenario: house price modeling. Using a dataset of house rental prices in San Diego, CA, U.S., we evaluate the regression accuracy across models and investigate model performance given different sets of explanatory variables.

A.1 Data description and feature selection

The open data behind the Inside Airbnb site (see Note 3) was collected for this appendix experiment. The data is sourced from publicly available information on the Airbnb site and includes the daily rental price of listed properties together with many additional attributes of the listings. To compare SRGCNNs with traditional spatial regression models, we examine a subset of this information for all Airbnb listings in San Diego, CA, U.S. on July 7, 2016.

Fig. 11a maps the natural logarithm of the prices, ln(price). The map uses a percentile color scheme to highlight both the extremely high prices (top 1%, in yellow) and the extremely low prices (bottom 1%, in black). There are 6,110 Airbnb listings in total. The prices are clearly spatially autocorrelated, with high-value as well as low-value clusters within the study area. Fig. 11b presents the house characteristics of interest in this experiment, which include both continuous variables (e.g., the number of people accommodated, “accommodates”, and the number of beds, “beds”) and categorical variables (e.g., rent type, “rt_XXX”, and property group, “pg_XXX”). We also include a binary variable named “coastal” to indicate whether a house is near the ocean.

Fig. 11 Daily price in local currency for Airbnb listings in San Diego, U.S., collected on July 7, 2016. a Map visualization of ln(price). b Listing attributes used as the variables of interest

In the following regression analysis, we use ln(price) as the dependent variable y and examine two sets of independent variables, X (four variables) and X+ (eleven variables). The basic set X = {“accommodates”, “bathrooms”, “bedrooms”, “beds”} contains only the four continuous intrinsic characteristics: the number of people accommodated, the number of bathrooms, the number of bedrooms, and the number of beds. The extended set X+ = {“accommodates”, “bathrooms”, “bedrooms”, “beds”, “rt_Private_room”, “rt_Shared_room”, “pg_House”, “pg_Condominium”, “pg_Townhouse”, “pg_Other”, “coastal”} adds the rent type, the property group, and the coastal indicator. The rent types are encoded as dummy variables denoting whether a listing is a private room, a shared room, or an entire home. The property groups are also encoded as dummy variables, indicating whether the listing is an apartment, a condominium, a townhouse, a single-family house, or other. Entire homes and apartments have no column of their own in X+ and therefore act as the reference categories.
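As a minimal sketch of how the two variable sets could be assembled, the snippet below encodes two made-up listings. The raw field names `room_type` and `property_group`, their category labels, and the sample values are assumptions for illustration; only the output column names come from the text above.

```python
import math

# Output column names follow the text; everything else is illustrative.
X_COLS = ["accommodates", "bathrooms", "bedrooms", "beds"]
RENT_DUMMIES = {"Private room": "rt_Private_room", "Shared room": "rt_Shared_room"}
GROUP_DUMMIES = {
    "House": "pg_House",
    "Condominium": "pg_Condominium",
    "Townhouse": "pg_Townhouse",
    "Other": "pg_Other",
}
X_PLUS_COLS = (
    X_COLS + list(RENT_DUMMIES.values()) + list(GROUP_DUMMIES.values()) + ["coastal"]
)


def encode(listing):
    """Return (x, x_plus, y) for one listing given as a dict of raw fields."""
    x = [float(listing[c]) for c in X_COLS]
    # Reference categories (entire home, apartment) yield all-zero dummies.
    rent = [1.0 if listing["room_type"] == k else 0.0 for k in RENT_DUMMIES]
    group = [1.0 if listing["property_group"] == k else 0.0 for k in GROUP_DUMMIES]
    x_plus = x + rent + group + [float(listing["coastal"])]
    y = math.log(listing["price"])  # dependent variable ln(price)
    return x, x_plus, y


# Hypothetical listings, not taken from the dataset.
listings = [
    {"accommodates": 4, "bathrooms": 1.0, "bedrooms": 2, "beds": 2,
     "room_type": "Entire home/apt", "property_group": "Apartment",
     "coastal": 1, "price": 150.0},
    {"accommodates": 2, "bathrooms": 1.0, "bedrooms": 1, "beds": 1,
     "room_type": "Private room", "property_group": "House",
     "coastal": 0, "price": 60.0},
]
encoded = [encode(item) for item in listings]
```

The first listing falls into both reference categories, so its six dummy entries are all zero, while the second sets `rt_Private_room` and `pg_House` to one.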

Note that other characteristics of the surrounding environment, such as the distance to highways and the number of parks in the neighborhood, could be included as independent variables to further improve the regression accuracy. However, the selection of informative feature variables is beyond the scope of this paper. Here, we provide two different sets of independent variables only to shed light on the influence of feature engineering in SRGCNN-based models. Future applications are invited to test SRGCNNs with different feature combinations in specialized tasks.

A.2 Model training

The regressions on prices at all locations (100% training ratio) are performed using a linear regression model (LR), a spatial autoregressive model (SAR), and the SRGCNN-GW model (Eq. 19). For simplicity, models using the extended variable set X+ are referred to as LR+, SAR+, and SRGCNN-GW+, respectively. We choose the SRGCNN-GW model here rather than the basic SRGCNN model because SRGCNN-GW is better at fitting the training dataset, whereas the basic SRGCNN model is better for prediction (as discussed in Section 5.2).

We consider the k = 20 nearest neighbors of each location to construct the spatial weights matrix in SAR and the graph structure in SRGCNN-GW. The spatial structure can be defined differently, e.g., with a different k or with other criteria such as a distance threshold or queen adjacency. We do not explore these alternatives here because the influence of geographic context on spatial regression is a separate topic [25] and is beyond the scope of this paper.
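A k-nearest-neighbor structure of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `knn_weights` is hypothetical, the coordinates are synthetic and assumed to be in a projected (planar) system, and row standardization is a common convention that the text does not confirm.

```python
import numpy as np


def knn_weights(coords, k=20):
    """Row-standardized k-nearest-neighbor weights matrix (dense, n x n)."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of locations")
    # Brute-force pairwise Euclidean distances; acceptable for small n only.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a location is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    return W / W.sum(axis=1, keepdims=True)  # each row sums to 1


rng = np.random.default_rng(0)
coords = rng.uniform(size=(100, 2))  # synthetic points, not the Airbnb data
W = knn_weights(coords, k=20)
```

The resulting matrix is generally asymmetric, because i being among the nearest neighbors of j does not imply the reverse. For several thousand locations, a KD-tree or a sparse matrix would be preferable to the dense distance array used here.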

We adopt training settings similar to those introduced in Section 4.2.3. The learning rate is changed to η = 3 × 10⁻². Training epochs are capped at 15,000 for SRGCNN-GW and 18,000 for SRGCNN-GW+, and we record the best results among all epochs. The number of hidden feature units is set to 4 × 8 = 32 for SRGCNN-GW and 11 × 8 = 88 for SRGCNN-GW+, reflecting the different numbers of input features in X and X+. The MSE loss and MAPE during training are plotted in Fig. 12. As can be seen, SRGCNN-GW reaches its lowest MAPE at epoch 12,161, while SRGCNN-GW+ reaches its lowest MAPE at epoch 15,508. After that, both models exhibit overfitting as the MAPE starts to rise again. Overall, SRGCNN-GW+ converges slightly more slowly (requiring more epochs) than SRGCNN-GW because there are more feature parameters to learn in the geographically weighted graph convolutions.
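The two bookkeeping rules in this paragraph, eight hidden units per input feature and keeping the epoch with the lowest MAPE, can be sketched as below. The helper names and the MAPE values are hypothetical, and counting epochs from 1 is an assumption.

```python
def hidden_units(n_features, multiplier=8):
    """Hidden width as described in the text: eight units per input feature."""
    return n_features * multiplier


def best_epoch(mape_history):
    """Return (epoch, mape) of the lowest MAPE; epochs are counted from 1."""
    if not mape_history:
        raise ValueError("mape_history is empty")
    idx = min(range(len(mape_history)), key=mape_history.__getitem__)
    return idx + 1, mape_history[idx]


# Made-up MAPE curve that falls and then rises, mimicking overfitting.
history = [0.09, 0.07, 0.05, 0.045, 0.047, 0.052]
epoch, mape = best_epoch(history)
```

In a real training loop, the model parameters would be saved whenever a new minimum is found, so that the recorded best result can be restored after the later epochs have overfitted.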

Fig. 12 Training process of the SRGCNN-GW models. a SRGCNN-GW, with X as the explanatory variables. b SRGCNN-GW+, with X+ as the explanatory variables

A.3 Evaluation of the results

The results across all models are summarized in Table 4. We report R² and MAPE as two metrics of goodness of fit. The Z-scored Moran's I values of the prediction errors and of the fitted ln(price) are also included to indicate how well the models capture the spatial effects in the data.
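For reference, the three kinds of metrics can be computed as in the sketch below. The Z-score uses the analytical variance of Moran's I under the normality assumption, which is an assumption here because the text does not say how the Z-scores were obtained (a permutation test is a common alternative). The chain-shaped weights matrix and the values are synthetic.

```python
import numpy as np


def r2_score(y, y_hat):
    """Coefficient of determination."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = ((y - y_hat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def mape(y, y_hat):
    """Mean absolute percentage error as a fraction (multiply by 100 for %)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean(np.abs((y - y_hat) / y))


def morans_i_z(values, W):
    """Global Moran's I and its z-score under the normality assumption."""
    x = np.asarray(values, float)
    W = np.asarray(W, float)
    n = x.size
    z = x - x.mean()
    s0 = W.sum()
    moran = (n / s0) * (z @ W @ z) / (z @ z)
    expected = -1.0 / (n - 1)
    s1 = 0.5 * ((W + W.T) ** 2).sum()
    s2 = ((W.sum(axis=0) + W.sum(axis=1)) ** 2).sum()
    var = (n**2 * s1 - n * s2 + 3 * s0**2) / ((n**2 - 1) * s0**2) - expected**2
    return moran, (moran - expected) / np.sqrt(var)


# Synthetic example: ten locations on a line, each linked to its neighbors.
n = 10
W_chain = np.zeros((n, n))
idx = np.arange(n - 1)
W_chain[idx, idx + 1] = 1.0
W_chain[idx + 1, idx] = 1.0
trend = np.arange(n, dtype=float)  # smoothly increasing values
moran, z_score = morans_i_z(trend, W_chain)
```

Applied to prediction errors, a Z-score near zero suggests that the model has absorbed most of the spatial structure, whereas a large positive value indicates spatially clustered residuals.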

Table 4 Model performances on fitting all prices (sampling ratio: 100%)

It is encouraging to find that SRGCNN-GW significantly outperforms LR and SAR in terms of both R² and MAPE. Using the basic independent variable set X, LR, the non-spatial linear regression model, can only explain about 56% of the real process; SAR, the most commonly used spatial lag model, increases the goodness of fit to about 62%; the SRGCNN-GW model, however, reaches a much higher goodness of fit of around 83%. With the additional explanatory information in X+, all models produce better results: LR increases from 55% to 68%, SAR from 62% to 71%, and SRGCNN-GW from 83% to 87%. Since SRGCNN-based models consider more complex spatial relationships during modeling, the additional independent variables have less influence on them than on traditional models such as linear regression and spatial lag models. With respect to MAPE, the conclusions are the same: SRGCNN-GW+ reaches the lowest fitting error, at only 3.76%. Judging from the Z-scored autocorrelations, SAR and SAR+ are better at handling the spatial errors. The SRGCNN-GW models also have lower error autocorrelations than the LR models. The SRGCNN-GW models report higher global autocorrelation of the fitted price than SAR, indicating that spatial structure is modeled explicitly in their graph convolution layers.

The results are also compared in Fig. 13. The spatial distributions of ln(price) are plotted in the first row for both the original data and the model predictions. The scatter plots and the Pearson correlation coefficients ρ are presented in the second row to further evaluate the models. As shown, SRGCNN-GW+ fits the price data best, with a modeled spatial pattern very similar to the original one and the highest Pearson correlation, ρ = 0.9334. The number of input features does influence the modeling accuracy, but it is still not clear how to select informative variables for SRGCNN models. Future work should develop specialized methods for visualizing and analyzing the complex feature parameters in SRGCNN models with regard to regression statistics.

Fig. 13 Model comparison in maps and scatterplots

About this article

Cite this article

Zhu, D., Liu, Y., Yao, X. et al. Spatial regression graph convolutional neural networks: A deep learning paradigm for spatial multivariate distributions. Geoinformatica (2021). https://doi.org/10.1007/s10707-021-00454-x

  • DOI: https://doi.org/10.1007/s10707-021-00454-x

Keywords

  • Spatial regression
  • Graph convolutional neural networks
  • Deep learning
  • GeoAI
  • Social sensing