Journal of Geographical Systems, Volume 9, Issue 4, pp 371–396

A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations

  • Steven Farber
  • Antonio Páez
Original Article


In geographically weighted regression (GWR), one must determine a window size that is used to subset the data locally. Typically, a cross-validation procedure is used to determine a globally optimal window size. Preliminary investigations indicate that the global cross-validation score is heavily influenced by a small number of observations in the dataset. At present, the ramifications of this behaviour for cross-validation are unknown. The research reported here explores the extent to which individual observations and groups of observations affect the determination of the optimal window size, and whether one can explain why some points are more influential than others. In addition, we examine the impact that neighbourhood specification has on model quality, in terms of predictive capability and the ability of the method to retrieve spatially varying processes. The analysis is based on several empirical datasets and on simulated data, so that results can be compared and validated. The results provide some practical guidelines for the use of cross-validation.
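To make the cross-validation procedure described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes an adaptive window defined by the `k` nearest neighbours and a bisquare kernel; both choices, the function name `gwr_cv_contributions`, and the simulated data are illustrative assumptions. For each candidate window size it computes each observation's leave-one-out squared prediction error, sums these into a global score, and then reports how much of the score at the selected window size comes from the largest 5% of contributions.

```python
import numpy as np


def gwr_cv_contributions(coords, X, y, k):
    """Leave-one-out squared prediction errors for a GWR-style local fit.

    Assumption: bisquare kernel whose bandwidth at each location is the
    distance to the k-th nearest *other* observation (that neighbour itself
    receives weight zero, so k - 1 points carry positive weight).
    """
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    errors = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        d[i] = np.inf                      # leave observation i out
        h = np.sort(d)[k - 1]              # local bandwidth
        u = np.minimum(d / h, 1.0)         # points beyond h get weight 0
        w = (1.0 - u ** 2) ** 2            # bisquare weights
        sw = np.sqrt(w)
        # Weighted least squares via an ordinary least-squares solve
        beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
        errors[i] = (y[i] - Xd[i] @ beta) ** 2
    return errors


# Simulated data (hypothetical): one covariate with a spatially varying slope
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
slope = 1 + 0.3 * coords[:, 0]
y = 2 + slope * x + rng.normal(scale=0.5, size=n)

candidates = [20, 40, 80, 160]             # candidate window sizes
contrib = {k: gwr_cv_contributions(coords, x, y, k) for k in candidates}
scores = {k: c.sum() for k, c in contrib.items()}
best_k = min(scores, key=scores.get)       # window size minimising the score

# Share of the global score due to the 5% largest individual contributions
top = np.sort(contrib[best_k])[::-1][: n // 20]
share = top.sum() / scores[best_k]
print(best_k, round(share, 3))
```

Keeping the per-observation contributions, rather than only their sum, is what allows one to ask whether a handful of points dominates the choice of window size.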


Geographically weighted regression; Cross-validation score; Influential points; Goodness-of-fit; Polarization

JEL Classification




The authors would like to thank Jean Paelinck and the participants of the Spatial Statistics and Econometrics sessions in the 2006 North American Regional Science Meetings in Toronto for their feedback and suggestions. Three anonymous reviewers provided valuable comments and insights that helped improve the paper. This research was supported by NSERC grant #261872-03. Thanks are also due to Ontario’s Municipal Property Assessment Corporation and in particular Mr. Bill Bradley for their kind support regarding the use of Toronto’s housing data. All views expressed in the paper are those of the authors alone.

Supplementary material

10109_2007_51_MOESM1_ESM.xls (80 kb)
10109_2007_51_MOESM2_ESM.txt (9 kb)



Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Centre for Spatial Analysis/School of Geography and Earth Sciences, McMaster University, Hamilton, Canada
