A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations

  • Original Article
  • Journal of Geographical Systems

Abstract

In geographically weighted regression (GWR), one must determine a window size that is used to subset the data locally. Typically, a cross-validation procedure is used to determine a globally optimal window size. Preliminary investigations indicate that the global cross-validation score is heavily influenced by a small number of observations in the dataset. At present, the ramifications of this behaviour in cross-validation are unknown. The research reported here explores the extent to which individual observations and groups of observations influence the determination of the optimal window size, and whether one can explain why some points are more influential than others. In addition, we examine the impact that neighbourhood specification has on model quality, in terms of both predictive capability and the ability of the method to retrieve spatially varying processes. The analysis uses several empirical datasets together with simulated data in order to compare and validate results. The results provide some practical guidelines for the use of cross-validation.
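
The cross-validation procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an adaptive window defined by the number of nearest neighbours `k`, a bisquare kernel, and a leave-one-out score equal to the sum of squared prediction errors. The function name `loo_squared_errors`, the simulated data, and the candidate window sizes are all hypothetical choices made for illustration. Keeping the per-observation errors, rather than only their sum, makes it possible to see how much of the global score comes from a few points.

```python
import numpy as np


def loo_squared_errors(coords, X, y, k):
    """Leave-one-out squared prediction errors for a window of k neighbours.

    Assumptions (not taken from the article): bisquare kernel, adaptive
    bandwidth equal to the distance to the k-th nearest neighbour.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # add an intercept column
    errors = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i  # drop the focal observation
        d = np.linalg.norm(coords[others] - coords[i], axis=1)
        h = np.sort(d)[k - 1]  # distance to the k-th nearest neighbour
        w = np.clip(1.0 - (d / h) ** 2, 0.0, None) ** 2  # bisquare weights
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on scaled rows.
        beta, *_ = np.linalg.lstsq(
            design[others] * sw[:, None], y[others] * sw, rcond=None
        )
        errors[i] = (y[i] - design[i] @ beta) ** 2
    return errors


# Simulated data with a slope that drifts from west to east (hypothetical).
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
slope = 1.0 + 2.0 * coords[:, 0]
y = slope * x + rng.normal(scale=0.1, size=n)

candidates = [20, 40, 80, 160]
errors_by_k = {k: loo_squared_errors(coords, x, y, k) for k in candidates}
cv_scores = {k: e.sum() for k, e in errors_by_k.items()}
best_k = min(cv_scores, key=cv_scores.get)

# Share of the global score contributed by the 5% largest individual errors.
m = max(1, n // 20)
best_errors = errors_by_k[best_k]
top_share = np.sort(best_errors)[-m:].sum() / best_errors.sum()

print("CV score by window size:", cv_scores)
print("selected window size:", best_k)
print("share of CV score from top 5% of observations:", round(top_share, 3))
```

A large `top_share` would indicate that the selected window size depends mainly on a handful of observations, which is the behaviour the abstract sets out to investigate.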

Notes

  1. For Toronto, 400 neighbours is the largest bandwidth tested, so the frequency of local optima at 400 is likely inflated by points that perform best under even larger bandwidths. We would have liked to compute cross-validation scores for larger bandwidths, but the software used to perform the GWR computations cannot currently process such large matrices.

Acknowledgments

The authors would like to thank Jean Paelinck and the participants of the Spatial Statistics and Econometrics sessions at the 2006 North American Regional Science Meetings in Toronto for their feedback and suggestions. Three anonymous reviewers provided valuable comments and insights that helped improve the paper. This research was supported by NSERC grant #261872-03. Thanks are also due to Ontario’s Municipal Property Assessment Corporation, and in particular Mr. Bill Bradley, for their kind support regarding the use of Toronto’s housing data. All views expressed in the paper are those of the authors alone.

Author information

Corresponding author

Correspondence to Steven Farber.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (80.0 KB XLS)

Supplementary material (8.77 KB TXT)

About this article

Cite this article

Farber, S., Páez, A. A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations. J Geograph Syst 9, 371–396 (2007). https://doi.org/10.1007/s10109-007-0051-3

  • DOI: https://doi.org/10.1007/s10109-007-0051-3
