Spatial Filter Versus Conventional Spatial Model Specifications: Some Comparisons

Griffith, Daniel A.; Paelinck, Jean H.P.

doi:10.1007/978-3-642-16043-1_7

Part of the book series: Advances in Geographic Information Science (AGIS, volume 1)

Abstract

Spatial statistical analysis of geographically distributed count data has been widely undertaken for many years. Initial analyses involved log-Gaussian approximations because only the normal probability model was first adapted in an implementable form (Ripley, 1990, pp. 9–10) to handle spatial autocorrelation (SA) effects (i.e., similar values tend to cluster on a map, indicating positive self-correlation among observations). In more recent years, linear regression techniques have given way to generalized linear model techniques that account for non-normality (e.g., logistic and Poisson regression), as well as geographic dependence. In very recent years, both linear and generalized linear models have been supplemented with hierarchical Bayesian models, in part to deal with geographic regions having small counts. The objective of this chapter is to compare the principal techniques available for map analysis, both frequentist and Bayesian, with the newly formulated spatial filtering approach.

This material is based upon work supported by the National Science Foundation under Grant No. BCS-9905213. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Notes

1.
A hierarchical generalized linear model (HGLM) is a generalized linear model (GLM; e.g., Poisson, binomial, gamma) with multiple levels. The lowest level posits the probability model for individual observations. Higher levels posit probability models for parameters (e.g., prior distributions).
2.
A careful inspection of these data from multiple sources reveals published discrepancies: Cressie (1999), Waller and Gotway (2004) and GeoBUGS enumerate correct lists of geographic neighbors; in contrast, Clayton and Kaldor (1987), Breslow and Clayton (1993), Stern and Cressie (1999)—vis-à-vis Cressie and Guo (1987)—and Lee and Nelder (2001) have the lists of neighbors for Annandale and Tweeddale switched.
3.
An offset variable is one whose regression coefficient is known to be, and hence is set equal to, 1.
4.
Regardless of the context, regional aggregates with small base populations tend to yield imprecise standardized ratios, whereas regional aggregates with large base populations almost always yield significant results.
5.
For the Scottish lip cancer data, latitude and longitude geo-coordinates were retrieved from Waller and Gotway (see http://www.sph.emory.edu/~lwaller/ch9index.htm), and then refined with an ArcView script.
6.
The Moran coefficient (MC) is a covariation-based measure that is similar to a Pearson product-moment correlation coefficient, and has an approximate range of \((1/\lambda_n, 1/\lambda_2)\), where \(\lambda_2\) and \(\lambda_n\) respectively are the second largest and smallest eigenvalues of matrix C. Its expected value is \(-1/(n-1)\). The Geary ratio (GR) is a paired comparisons type of index, is inversely related to the MC, has an approximate range of (0, 2), and has an expected value of 1.
7.
The non-zero exponent functional form is \(Y_N = \alpha + \beta\left[(O_i + \delta)/(E_i + \delta)\right]^{\gamma}\), which here yields rounded-off parameter estimates of \(\hat{\delta} = 0.10\) and \(\hat{\gamma} = 0.33\) [RESS = \(1.48 \times 10^{-2}\); P(S-W) = 0.726]. Setting \(\delta = 0.5\) yields \(\hat{\gamma} = -0.10\) (RESS = \(1.71 \times 10^{-2}\)), which is very close to 0. Setting \(\delta = 0.5\) and executing Friendly’s SAS macro boxcox[1].sas (http://www.math.yorku.ca/SCS/sasmac/boxcox.html) yields \(\hat{\gamma} = 0.20\) (RESS = \(2.56 \times 10^{-2}\)), which also is close to 0; setting \(\delta = 0.1\) yields \(\hat{\gamma} = 0.31\), which essentially is the same result obtained with the quantile equation. As an aside, the Freeman-Tukey transformation (Cressie, 1991, p. 540) furnishes an inferior result with its translation parameter of 1 [P(S-W) = 0.061]; its optimal translation parameter estimate also is 0.5, which modestly improves its performance here [P(S-W) = 0.131].
8.
A translation parameter is added to both the numerator and the denominator because \(E_i\) is based upon a sum of the \(O_i\) values (i.e., the sums of the \(E_i\) and the \(O_i\) values are equal). In the simple case of each regional expected value being calculated with a landscape-wide rate, for example: \(P_i \frac{\sum_{i=1}^{N} (O_i + \delta)}{\sum_{i=1}^{N} P_i} = P_i \frac{\sum_{i=1}^{N} O_i}{\sum_{i=1}^{N} P_i} + \frac{P_i}{\sum_{i=1}^{N} P_i} N\delta \stackrel{P_i \rightarrow P_i}{\longrightarrow} \delta \Rightarrow \frac{O_i + \delta}{E_i + \delta}\), for regional “base populations” \(P_i\) in the \(i\)th of N areal units.
9.
A value of 0.25 for the MC tends to relate to about 5% of the variance in Y being attributable to redundant information arising from latent spatial autocorrelation, given a particular areal unit neighborhood configuration.
10.
The Levene test statistic was used to assess homogeneity of variance across groupings because the magnitude of the numbers involved allows them to be treated as though they approximate a continuous random variable. Meanwhile, there is no reason to expect that these sets of numbers conform to normal distributions, eliminating the possibility of using the Bartlett test statistic. The R measure is described in Gelman and Rubin (1992).
11.
These computations are based upon the Fisher information matrix.
12.
Huffer and Wu (1998, p. 514) note that studying the multivariate behavior of MCMC parameter estimates is a rather complicated and daunting problem, and suggest examining only univariate aspects of the sampling distributions of the individual MCMC estimates (i.e., each parameter estimate separately).
13.
Cressie employs the transformation \(\sqrt{\frac{\#\text{ of lip cancer cases}}{\#\text{ of males at risk}}} + \sqrt{\frac{\#\text{ of lip cancer cases} + 1}{\#\text{ of males at risk}}}\).
14.
The P(S-W) values for the various models are: 0.922 for the SAR, 0.703 for the Winsorized auto-Poisson, 0.445 for the Poisson spatial filter, 0.067 for the GeoBUGS proper CAR, and 0.575 for the BUGS spatial filter specification.
15.
PCAR denotes the proper conditional autoregressive model specification, which restricts the value of the autoregressive parameter to its feasible parameter space.
16.
Effective degrees of freedom were calculated in BUGS as parameter estimate p_D (see Spiegelhalter et al., 2002).
17.
The P(S-W) values for the various models are: 0.401 for the SAR, 0.289 for the Winsorized auto-Poisson, 0.464 for the Poisson spatial filter, 0.092 for the GeoBUGS proper CAR, and 0.926 for the BUGS spatial filter specification.
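
As a rough illustration of the two indices described in note 6, the following Python sketch computes the Moran coefficient (MC) and the Geary ratio (GR) for a toy map and checks their expected values by averaging over every permutation of the data. The four-unit chain map, the binary neighbor matrix `C`, the data values, and all function names are assumptions made for illustration only and do not come from the chapter; the formulas are the standard textbook definitions of the two indices.

```python
from itertools import permutations


def moran_coefficient(y, c):
    """Moran coefficient of values y under a symmetric 0/1 neighbor matrix c."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]                      # deviations from the mean
    s0 = sum(sum(row) for row in c)                # sum of all weights
    cross = sum(c[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)


def geary_ratio(y, c):
    """Geary ratio: a paired-comparisons index based on squared differences."""
    n = len(y)
    mean = sum(y) / n
    ss = sum((v - mean) ** 2 for v in y)           # total sum of squares
    s0 = sum(sum(row) for row in c)
    sq_diff = sum(c[i][j] * (y[i] - y[j]) ** 2 for i in range(n) for j in range(n))
    return ((n - 1) / (2 * s0)) * sq_diff / ss


def permutation_mean(stat, y, c):
    """Average of a statistic over all n! reassignments of values to units."""
    values = [stat(list(p), c) for p in permutations(y)]
    return sum(values) / len(values)


# Hypothetical toy map: four areal units in a chain (1-2, 2-3, 3-4).
C = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [1.0, 2.0, 3.0, 4.0]                           # a smooth trend along the chain

mc = moran_coefficient(y, C)
gr = geary_ratio(y, C)
mc_mean = permutation_mean(moran_coefficient, y, C)
gr_mean = permutation_mean(geary_ratio, y, C)

print(f"MC = {mc:.4f}, permutation mean = {mc_mean:.4f}, -1/(n-1) = {-1 / (len(y) - 1):.4f}")
print(f"GR = {gr:.4f}, permutation mean = {gr_mean:.4f}")
```

For these trend-like toy values the MC lies above its expected value of –1/(n–1) and the GR lies below 1, both pointing to positive SA, which is consistent with the inverse relationship between the two indices. Exhaustive permutation is feasible only for very small n; a random sample of permutations would be the usual substitute for a real map.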

References

Bartlett, M. 1947. The use of transformations, Biometrics, 3: 39–52.
Besag, J.E. 1974. Spatial interaction and the statistical analysis of lattice systems, Journal of the Royal Statistical Society B, 36: 192–225.
Besag, J., York, J., Mollié, A. 1991. Bayesian image restoration with two applications in spatial statistics, Annals of the Institute of Statistical Mathematics, 43: 1–59.
Breslow, N., Clayton, D. 1993. Approximate inference in generalized linear mixed models, Journal of the American Statistical Association, 88: 9–25.
Casella, G. 1985. An introduction to empirical Bayes data analysis, The American Statistician, 39: 83–87.
Casella, G., George, E. 1992. Explaining the Gibbs sampler, The American Statistician, 46: 167–174.
Chinn, S. 1996. Choosing a transformation, Journal of Applied Statistics, 23: 395–404.
Clayton, D., Kaldor, J. 1987. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping, Biometrics, 43: 671–681.
Clifford, P., Richardson, S., Hémon, D. 1989. Assessing the significance of the correlation between two spatial processes, Biometrics, 45: 123–134.
Cressie, N. 1989. Geostatistics, The American Statistician, 43: 197–202.
Cressie, N. 1991. Statistics for Spatial Data. New York, NY: Wiley.
Cressie, N., Guo, R. 1987. Mapping variables, in Proceedings of the NCGA Conference, Computer Graphics ’87. McLean, VA: National Computer Graphics Association, III: 521–530.
de Jong, P., Sprenger, C., Van Veen, F. 1984. On extreme values of Moran’s I and Geary’s c, Geographical Analysis, 16: 17–24.
Dutilleul, P. 1993. Modifying the t test for assessing the correlation between two spatial processes, Biometrics, 49: 305–314.
Gelman, A., Rubin, D. 1992. Inference from iterative simulation using multiple sequences (with discussion), Statistical Science, 7: 457–511.
Getis, A., Griffith, D. 2002. Comparative spatial filtering in regression analysis, Geographical Analysis, 34: 130–140.
Gilks, W., Richardson, S., Spiegelhalter, D. (eds.). 1996. Markov Chain Monte Carlo in Practice. New York, NY: Chapman & Hall.
Griffith, D. 2000a. A linear regression solution to the spatial autocorrelation problem, Journal of Geographical Systems, 2: 141–156.
Griffith, D. 2002a. A spatial filtering specification for the auto-Poisson model, Statistics and Probability Letters, 58: 245–251.
Griffith, D. 2003. Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Berlin: Springer.
Griffith, D., Haining, R. 2006. Beyond mule kicks: The Poisson distribution in geographical analysis, Geographical Analysis, 38: 123–139.
Haining, R. 1990. Spatial Data Analysis in the Social and Environmental Sciences. Cambridge: Cambridge University Press.
Haining, R. 1991. Bivariate correlation and spatial data, Geographical Analysis, 23: 210–227.
Hill, E., Allen, A., Waller, L. 1999. A comparison of focused score tests and Bayesian hierarchical models for detecting spatial disease clustering, Journal of the National Institute of Public Health, 48: 102–112.
Hubbell, S., Ahumada, J., Condit, R., Foster, R. 2001. Local neighborhood effects on long-term survival of individual trees in a neotropical forest, Ecological Research, 16: 859–875.
Huffer, F., Wu, H. 1998. Markov chain Monte Carlo for autologistic regression models with application to the distribution of plant species, Biometrics, 54: 509–524.
Kaiser, M., Cressie, N. 1997. Modeling Poisson variables with positive spatial dependence, Statistics and Probability Letters, 35: 423–432.
Kuehl, R. 1994. Statistical Principles of Research Design and Analysis. Belmont, CA: Duxbury Press.
Lee, Y., Nelder, J. 2001. Modelling and analysing correlated non-normal data, Statistical Modelling, 1: 3–16.
McCullagh, P., Nelder, J. 1983 (2nd ed., 1989). Generalized Linear Models, 1st ed. London: Chapman & Hall.
Mollié, A. 1996. Bayesian mapping of disease. In W. Gilks, S. Richardson, D. Spiegelhalter (eds.), Markov Chain Monte Carlo in Practice. New York, NY: Chapman & Hall, pp. 359–379.
Ripley, B. 1990. Gibbsian interaction models. In D.A. Griffith (ed.), Spatial Statistics: Past, Present, and Future. Ann Arbor, MI: Institute of Mathematical Geography, pp. 3–25.
Snedecor, G., Cochran, W. 1967. Statistical Methods, Sixth Edition. Ames, IA: Iowa State U. Press.
Spiegelhalter, D., Best, N., Carlin, B., Van der Linde, A. 2002. Bayesian measures of model complexity and fit (with discussion), Journal of the Royal Statistical Society B, 64: 583–640.
Stern, H., Cressie, N. 1999. Inference for extremes in disease mapping. In A. Lawson, A. Biggeri, D. Bohning, E. Lesaffre, J-F. Viel, R. Bertollini (eds.), Disease Mapping and Risk Assessment for Public Health. Chichester: Wiley, pp. 63–84.
Thomas, A., Best, N., Lunn, D., Arnold, R., Spiegelhalter, D. 2004. GeoBUGS User Manual, version 1.2. Accessed at http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/geobugs12manual.pdf on 3/4/2005.
Tiefelsdorf, M., Boots, B. 1995. The exact distribution of Moran’s I, Environment and Planning A, 27: 985–999.
Upton, G., Fingleton, B. 1989. Spatial Data Analysis by Example, vol. 2. New York, NY: Wiley.
Waller, L., Gotway, C. 2004. Applied Spatial Statistics for Public Health Data. New York, NY: Wiley.
Wrigley, N. 1985 (reprinted in 2002 by Blackburn). Categorical Data Analysis for Geographers and Environmental Scientists. London: Longman.
Yeo, I.-K., Johnson, R. 2000. A new family of power transformations to improve normality or symmetry, Biometrika, 87: 954–959.

Author information

Authors and Affiliations

University of Texas, Dallas School of Economic, Political & Policy Sciences, 800 W. Campbell Road, 75080, Richardson, Texas, USA
Prof. Daniel A. Griffith
George Mason University, School of Public Policy, Oranjelaan 36, 3062 BT, Rotterdam, Netherlands
Prof. Dr. Jean H.P. Paelinck

Corresponding author

Correspondence to Daniel A. Griffith.

About this chapter

Cite this chapter

Griffith, D.A., Paelinck, J.H. (2011). Spatial Filter Versus Conventional Spatial Model Specifications: Some Comparisons. In: Non-standard Spatial Statistics and Spatial Econometrics. Advances in Geographic Information Science, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16043-1_7

DOI: https://doi.org/10.1007/978-3-642-16043-1_7
Published: 08 November 2010
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16042-4
Online ISBN: 978-3-642-16043-1
eBook Packages: Earth and Environmental Science (R0)
