Abstract
Spatial statistical analysis of geographically distributed count data has been undertaken for many years. Initial analyses involved log-Gaussian approximations, because the normal probability model was the first to be adapted in an implementable form (Ripley, 1990, pp. 9–10) to handle spatial autocorrelation (SA) effects (i.e., similar values tend to cluster on a map, indicating positive self-correlation among observations). In more recent years, linear regression techniques have given way to generalized linear model techniques that account for non-normality (e.g., logistic and Poisson regression) as well as geographic dependence. In very recent years, both linear and generalized linear models have been supplemented with hierarchical Bayesian models, in part to deal with geographic regions having small counts. The objective of this chapter is to compare these principal techniques for map analysis, both frequentist and Bayesian, with the newly formulated spatial filtering approach.
This material is based upon work supported by the National Science Foundation under Grant No. BCS-9905213. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Notes
- 1.
A HGLM is a GLM (e.g., Poisson, binomial, gamma) with multiple levels. The lowest level posits the probability model for individual observations. Higher levels posit probability models for parameters (e.g., prior distributions).
- 2.
A careful inspection of these data from multiple sources reveals published discrepancies: Cressie (1999), Waller and Gotway (2004) and GeoBUGS enumerate correct lists of geographic neighbors; in contrast, Clayton and Kaldor (1987), Breslow and Clayton (1993), Stern and Cressie (1999)—vis-à-vis Cressie and Guo (1987)—and Lee and Nelder (2001) have the lists of neighbors for Annandale and Tweeddale switched.
- 3.
An offset variable is one whose regression coefficient is known to be, and hence is set equal to, 1.
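To make the role of an offset concrete, the following minimal sketch fits an intercept-only Poisson model whose linear predictor is log(E_i) + b0, with the coefficient on log(E_i) fixed at 1. The function name and the toy counts are hypothetical and are not taken from the chapter; the closed-form estimate holds only for this intercept-only case.

```python
import math

def fit_intercept_with_offset(observed, expected):
    """MLE of b0 in log(mu_i) = 1 * log(E_i) + b0 (intercept-only Poisson model)."""
    # The offset log(E_i) enters the linear predictor with its coefficient fixed at 1,
    # so setting the score sum(O_i) - exp(b0) * sum(E_i) to zero gives a closed form.
    return math.log(sum(observed) / sum(expected))

# Hypothetical toy data (not the Scottish lip cancer counts).
observed = [2, 3, 5]
expected = [1.0, 2.0, 2.0]

b0 = fit_intercept_with_offset(observed, expected)
fitted = [e * math.exp(b0) for e in expected]  # mu_i = E_i * exp(b0)
```

Because the offset is not estimated, the fitted means are simply the expected counts rescaled by a common factor exp(b0).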
- 4.
Regardless of the context, regional aggregates with small base populations tend to yield imprecise standardized ratios, whereas regional aggregates with large base populations almost always yield significant results.
- 5.
For the Scottish lip cancer data, latitude and longitude geo-coordinates were retrieved from Waller and Gotway (see http://www.sph.emory.edu/~lwaller/ch9index.htm), and then refined with an ArcView script.
- 6.
The MC is a covariation-based measure that is similar to a Pearson product-moment correlation coefficient, and has an approximate range of (1/\({{{\uplambda }}_{\textrm{n}}}\), 1/\({{{\uplambda }}_{2}}\)), where \({{{\uplambda }}_{2}}\) and \({{{\uplambda }}_{\textrm{n}}}\) respectively are the second largest and smallest eigenvalues of matrix C. Its expected value is –1/(n–1). The GR is a paired comparisons type of index, is inversely related to the MC, has an approximate range of (0, 2), and has an expected value of 1.
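As a rough illustration of the two indices, the sketch below computes the MC and the GR from their standard textbook definitions for a binary connectivity matrix C. The function name, the four-unit chain of neighbors, and the attribute values are hypothetical and serve only as a toy example.

```python
def moran_geary(y, c):
    """Return (MC, GR) for values y and a symmetric 0/1 connectivity matrix c."""
    n = len(y)
    mean_y = sum(y) / n
    dev = [v - mean_y for v in y]
    ss = sum(d * d for d in dev)              # sum of squared deviations
    total_c = sum(sum(row) for row in c)      # sum of all c_ij
    # MC: cross-products of deviations for neighboring pairs.
    cross = sum(c[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    # GR: squared differences for neighboring pairs (paired comparisons).
    sqdiff = sum(c[i][j] * (y[i] - y[j]) ** 2 for i in range(n) for j in range(n))
    mc = (n / total_c) * cross / ss
    gr = ((n - 1) / (2 * total_c)) * sqdiff / ss
    return mc, gr

# Hypothetical example: four areal units arranged in a chain, with a smooth trend.
c = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
mc, gr = moran_geary([1.0, 2.0, 3.0, 4.0], c)
```

For this smooth trend the MC lies above its expected value of –1/(n–1) and the GR lies below 1, which illustrates the inverse relationship between the two measures.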
- 7.
The non-zero exponent functional form is \({\textrm{Y}}_{\textrm{N}} = \upalpha + \upbeta \left[ ({\textrm{O}}_{\textrm{i}} + \updelta)/({\textrm{E}}_{\textrm{i}} + \updelta) \right]^{\upgamma}\), which here yields rounded-off parameter estimates of \(\hat{\updelta}\) = 0.10 and \(\hat{\upgamma}\) = 0.33 [RESS = \(1.48 \times 10^{-2}\); P(S-W) = 0.726]. Setting \(\updelta\) = 0.5 yields \(\hat{\upgamma}\) = –0.10 (RESS = \(1.71 \times 10^{-2}\)), which is very close to 0. Setting \(\updelta\) = 0.5 and executing Friendly’s SAS macro boxcox[1].sas (http://www.math.yorku.ca/SCS/sasmac/boxcox.html) yields \(\hat{\upgamma}\) = 0.20 (RESS = \(2.56 \times 10^{-2}\)), which also is close to 0; setting \(\updelta\) = 0.1 yields \(\hat{\upgamma}\) = 0.31, which essentially is the same result obtained with the quantile equation. As an aside, the Freeman-Tukey transformation (Cressie, 1991, p. 540) furnishes an inferior result with its translation parameter of 1 [P(S-W) = 0.061]; its optimal translation parameter estimate also is 0.5, which modestly improves its performance here [P(S-W) = 0.131].
- 8.
A translation parameter is added to both the numerator and the denominator because \(E_i\) is based upon a sum of the \(O_i\)s (i.e., the sums of the \(E_i\)s and the \(O_i\)s are equal). In the simple case of each regional expected value being calculated with a landscape-wide rate, for example: \({{\textrm{P}}_{\textrm{i}}}\frac{{\sum\limits_{{\textrm{i}} = {1}}^{\textrm{N}} {{(}{{\textrm{O}}_{\textrm{i}}} + {{\updelta )}}} }}{{\sum\limits_{{\textrm{i}} = {\textrm{1}}}^{\textrm{N}} {{{\textrm{P}}_{\textrm{i}}}} }} = {{\textrm{P}}_{\textrm{i}}}\frac{{\sum\limits_{{\textrm{i}} = {\textrm{1}}}^{\textrm{N}} {{{\textrm{O}}_{\textrm{i}}}} }}{{\sum\limits_{{\textrm{i}} = {\textrm{1}}}^{\textrm{N}} {{{\textrm{P}}_{\textrm{i}}}} }} + \frac{{{{\textrm{P}}_{\textrm{i}}}}}{{\sum\limits_{{\textrm{i}} = {\textrm{1}}}^{\textrm{N}} {{{\textrm{P}}_{\textrm{i}}}}}}\textrm{N}{\updelta} \stackrel{{P_{i} \rightarrow P_{i}}}{\longrightarrow} {\updelta}\Rightarrow \frac{{{{\textrm{O}}_{\textrm{i}}} + {{\updelta }}}}{{{{\textrm{E}}_{\textrm{i}}} + {{\updelta }}}}\), for regional “base populations” \(P_i\) in the \(i\)th of N areal units.
- 9.
A value of 0.25 for the MC tends to relate to about 5% of the variance in Y being attributable to redundant information arising from latent spatial autocorrelation, given a particular areal unit neighborhood configuration.
- 10.
The Levene test statistic was used to assess homogeneity of variance across groupings because the magnitude of the numbers involved allows them to be treated as though they approximate a continuous random variable. Meanwhile, there is no reason to expect that these sets of numbers conform to normal distributions, eliminating the possibility of using the Bartlett test statistic. The R measure is described in Gelman and Rubin (1992).
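For readers unfamiliar with the R measure, the sketch below computes a potential scale reduction factor from several chains of draws for a single parameter. It uses the commonly quoted simplified form (pooled variance estimate over within-chain variance) and omits the degrees-of-freedom correction in Gelman and Rubin (1992), so it is an approximation and not the chapter's computation; the function name and the toy chains are hypothetical.

```python
import math
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Simplified Gelman-Rubin R for equal-length chains of one scalar parameter."""
    m = len(chains)                      # number of chains
    n = len(chains[0])                   # draws per chain
    chain_means = [mean(ch) for ch in chains]
    grand_mean = mean(chain_means)
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = mean(variance(ch) for ch in chains)
    # Pooled estimate of the marginal posterior variance.
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)

# Hypothetical toy chains; real MCMC output would be far longer.
r_hat = potential_scale_reduction([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

Values near 1 suggest that the chains are sampling from a common distribution, whereas values well above 1 suggest that they have not yet mixed.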
- 11.
These computations are based upon the Fisher information matrix.
- 12.
Huffer and Wu (1998, p. 514) note that studying the multivariate behavior of MCMC parameter estimates is a rather complicated and daunting problem, and suggest examining only univariate aspects of the sampling distributions of the individual MCMC estimates (i.e., each parameter estimate separately).
- 13.
Cressie employs the transformation \(\sqrt {\frac{\#\,\,\hbox{of}\,\,\hbox{lip}\,\,\hbox{cancer}\,\,\hbox{cases}}{\#\,\,\hbox{of}\,\, \hbox{males}\,\, \hbox{at}\,\, \hbox{risk}}} \)+\(\sqrt {\frac{\#\,\,\hbox{of}\,\,\hbox{lip}\,\,\hbox{cancer}\,\,\hbox{cases} + 1}{\#\,\,\hbox{of}\,\, \hbox{males}\,\, \hbox{at}\,\, \hbox{risk}}} \).
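The transformation written above can be sketched directly in code. The function and argument names are hypothetical, and the inputs below are toy numbers, not the Scottish lip cancer data.

```python
import math

def freeman_tukey(cases, at_risk):
    """sqrt(cases / at_risk) + sqrt((cases + 1) / at_risk), as in the note above."""
    # The "+ 1" inside the second root is the translation parameter of 1.
    return math.sqrt(cases / at_risk) + math.sqrt((cases + 1) / at_risk)

# Hypothetical inputs: a region with zero cases still yields a non-zero value.
zero_case_value = freeman_tukey(0, 4)
three_case_value = freeman_tukey(3, 1)
```

The second term keeps regions with zero observed cases from all collapsing to the same transformed value of 0.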
- 14.
The P(S-W) values for the various models are: 0.922 for the SAR, 0.703 for the Winsorized auto-Poisson, 0.445 for the Poisson spatial filter, 0.067 for the GeoBUGS proper CAR, and 0.575 for the BUGS spatial filter specification.
- 15.
PCAR denotes the proper conditional autoregressive model specification, which restricts the value of the autoregressive parameter to its feasible parameter space.
- 16.
Effective degrees of freedom were calculated in BUGS as parameter estimate pD (see Spiegelhalter et al., 2002).
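The definition of pD in Spiegelhalter et al. (2002) is the posterior mean deviance minus the deviance evaluated at the posterior mean. The sketch below applies that definition to a single scalar parameter; it is not BUGS code, and the function name, the toy deviance, and the two draws are hypothetical.

```python
from statistics import mean

def effective_parameters(deviance, draws):
    """pD = mean deviance over the draws minus deviance at the mean of the draws."""
    mean_deviance = mean(deviance(theta) for theta in draws)
    deviance_at_mean = deviance(mean(draws))
    return mean_deviance - deviance_at_mean

# Hypothetical example: one observation y = 0 from a unit-variance normal,
# so the deviance is (y - theta)^2 up to an additive constant that cancels in pD.
def toy_deviance(theta):
    return (0.0 - theta) ** 2

p_d = effective_parameters(toy_deviance, [-1.0, 1.0])
```

With many parameters the same subtraction is applied to the deviance evaluated at the vector of posterior means.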
- 17.
The P(S-W) values for the various models are: 0.401 for the SAR, 0.289 for the Winsorized auto-Poisson, 0.464 for the Poisson spatial filter, 0.092 for the GeoBUGS proper CAR, and 0.926 for the BUGS spatial filter specification.
References
Bartlett, M. 1947. The use of transformations, Biometrics, 3: 39–52.
Besag, J.E. 1974. Spatial interaction and the statistical analysis of lattice systems, Journal of the Royal Statistical Society B, 36: 192–225.
Besag, J., York, J., Mollié, A. 1991. Bayesian image restoration with two applications in spatial statistics, Annals of the Institute of Statistical Mathematics, 43: 1–59.
Breslow, N., Clayton, D. 1993. Approximate inference in generalized linear mixed models, Journal of the American Statistical Association, 88: 9–25.
Casella, G. 1985. An introduction to empirical Bayes data analysis, The American Statistician, 39: 83–87.
Casella, G., George, E. 1992. Explaining the Gibbs sampler, The American Statistician, 46: 167–174.
Chinn, S. 1996. Choosing a transformation, Journal of Applied Statistics, 23: 395–404.
Clayton, D., Kaldor, J. 1987. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping, Biometrics, 43: 671–681.
Clifford, P., Richardson, S., Hémon, D. 1989. Assessing the significance of the correlation between two spatial processes, Biometrics, 45: 123–134.
Cressie, N. 1989. Geostatistics, The American Statistician, 43: 197–202.
Cressie, N. 1991. Statistics for Spatial Data. New York, NY: Wiley.
Cressie, N., Guo, R. 1987. Mapping variables, in Proceedings of the NCGA Conference, Computer Graphics ’87. McLean, VA: National Computer Graphics Association, III: 521–530.
de Jong, P., Sprenger, C., Van Veen, F. 1984. On extreme values of Moran’s I and Geary’s c, Geographical Analysis, 16: 17–24.
Dutilleul, P. 1993. Modifying the t test for assessing the correlation between two spatial processes, Biometrics, 49: 305–314.
Gelman, A., Rubin, D. 1992. Inference from iterative simulation using multiple sequences (with discussion), Statistical Science, 7: 457–511.
Getis, A., Griffith, D. 2002. Comparative spatial filtering in regression analysis, Geographical Analysis, 34: 130–140.
Gilks, R., Richardson, S., Spiegelhalter, D. (eds.). 1996. Markov Chain Monte Carlo in Practice. New York, NY: Chapman & Hall.
Griffith, D. 2000a. A linear regression solution to the spatial autocorrelation problem, Journal of Geographical Systems, 2: 141–156.
Griffith, D. 2002a. A spatial filtering specification for the auto-Poisson model, Statistics and Probability Letters, 58: 245–251.
Griffith, D. 2003. Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Berlin: Springer.
Griffith, D., Haining, R. 2006. Beyond mule kicks: The Poisson distribution in geographical analysis, Geographical Analysis, 38: 123–139.
Haining, R. 1990. Spatial Data Analysis in the Social and Environmental Sciences. Cambridge: Cambridge University Press.
Haining, R. 1991. Bivariate correlation and spatial data, Geographical Analysis, 23: 210–227.
Hill, E., Allen, A., Waller, L. 1999. A comparison of focused score tests and Bayesian hierarchical models for detecting spatial disease clustering, Journal of the National Institute of Public Health, 48: 102–112.
Hubbell, S., Ahumada, J., Condit, R., Foster, R. 2001. Local neighborhood effects on long-term survival of individual trees in a neotropical forest, Ecological Research, 16: 859–875.
Huffer, F., Wu, H. 1998. Markov chain Monte Carlo for autologistic regression models with application to the distribution of plant species, Biometrics, 54: 509–524.
Kaiser, M., Cressie, N. 1997. Modeling Poisson variables with positive spatial dependence, Statistics and Probability Letters, 35: 423–432.
Kuehl, O. 1994. Statistical Principles of Research Design and Analysis. Belmont, CA: Duxbury Press.
Lee, Y., Nelder, J. 2001. Modelling and analysing correlated non-normal data, Statistical Modelling, 1: 3–16.
McCullagh, P., Nelder, J. 1983 (2nd ed., 1989). Generalized Linear Models, 1st ed. London: Chapman & Hall.
Mollié, A. 1996. Bayesian mapping of disease. In R. Gilks, S. Richardson, D. Spiegelhalter (eds.), Markov Chain Monte Carlo in Practice. New York, NY: Chapman & Hall, pp. 359–379.
Ripley, B. 1990. Gibbsian interaction models. In D.A. Griffith (ed.), Spatial Statistics: Past, Present, and Future. Ann Arbor, MI: Institute of Mathematical Geography, pp. 3–25.
Snedecor, G., Cochran, W. 1967. Statistical Methods, Sixth Edition. Ames, IA: Iowa State U. Press.
Spiegelhalter, D., Best, N., Carlin, B., Van der Linde, A. 2002. Bayesian measures of model complexity and fit (with discussion), Journal of the Royal Statistical Society B, 64: 583–640.
Stern, H., Cressie, N. 1999. Inference for extremes in disease mapping. In A. Lawson, A. Biggeri, D. Bohning, E. Lesaffre, J-F. Viel, R. Bertollini (eds.), Disease Mapping and Risk Assessment for Public Health. Chichester: Wiley, pp. 63–84.
Thomas, A., Best, N., Lunn, D., Arnold, R., Spiegelhalter, D. 2004. GeoBUGS User Manual, version 1.2. Accessed at http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/geobugs12manual.pdf on 3/4/2005.
Tiefelsdorf, M., Boots, B. 1995. The exact distribution of Moran's I, Environment and Planning A, 27: 985–999.
Upton, G., Fingleton, B. 1989. Spatial Data Analysis by Example, vol. 2. New York, NY: Wiley.
Waller, L., Gotway, C. 2004. Applied Spatial Statistics for Public Health Data. New York, NY: Wiley.
Wrigley, N. 1985 (reprinted in 2002 by Blackburn). Categorical Data Analysis for Geographers and Environmental Scientists. London: Longman.
Yeo, I.-K., Johnson, R. 2000. A new family of power transformations to improve normality or symmetry, Biometrika, 87: 954–959.
Copyright information
© 2011 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Griffith, D.A., Paelinck, J.H. (2011). Spatial Filter Versus Conventional Spatial Model Specifications: Some Comparisons. In: Non-standard Spatial Statistics and Spatial Econometrics. Advances in Geographic Information Science, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16043-1_7
DOI: https://doi.org/10.1007/978-3-642-16043-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16042-4
Online ISBN: 978-3-642-16043-1
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)