Abstract
The original version of our Annals of Regional Science paper enumerates a number of topics that serve as focal points for the frontiers of spatial statistics and spatial econometrics.
Keywords
- Spatial Autocorrelation
- Spatial Econometric
- Ecological Fallacy
- Ancillary Information
- Exploratory Spatial Data Analysis
The original version of our Annals of Regional Science paper enumerates a number of topics that serve as focal points for the frontiers of spatial statistics and spatial econometrics. This first part of the book addresses some of these loosely connected topics in considerably more detail:
1. The ecological fallacy: Chap. 2
2. Spatially adjusted statistical techniques, and quantifying spatial autocorrelation: Chap. 3
3.
4. Bayesian hierarchical models: Chap. 6
5. Auto-model specification (normal, Poisson, binomial), and spatial structure as a covariate (spatial filtering): Chap. 7
6. Sampling network structure: design-based inference: Chap. 8
This selected list reflects research preferences of one of the authors, rather than some rank ordering of importance.
Considerable work still needs to be undertaken on the ecological fallacy. Two important aspects of this problem highlighted in Chap. 2 are: (1) georeferenced data are messy (standard statistical model and technique assumptions are not justified), and (2) sometimes only geographic aggregates can be treated. In the first case, many relationships are non-linear, which prevents them from being transferred from individuals to aggregates of individuals in a simple way. This critical feature interacts with mixtures of non-identical observations, creating heterogeneity and excessive variation for geographic random variables. Spatial autocorrelation accounts for only part of this total excess variation. In the second case, rates, for example, require aggregates of individuals, as do variables such as the rural-urban dichotomy.
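The point about non-linear relationships can be illustrated with a small simulation. The following is a minimal sketch, not taken from Chap. 2: the exponential relationship, the number of areal units, and the within-unit variability are all hypothetical choices made only for illustration. It shows that a relationship holding exactly for every individual does not hold for areal-unit means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setting: 50 areal units, each containing 200 individuals.
n_areas, n_per_area = 50, 200
area_means = rng.normal(0.0, 1.0, size=n_areas)
x = area_means[:, None] + rng.normal(0.0, 1.0, size=(n_areas, n_per_area))

# Assumed individual-level relationship: exact and non-linear.
y = np.exp(x)

# Aggregate both variables to areal-unit means.
x_bar = x.mean(axis=1)
y_bar = y.mean(axis=1)

# Apply the individual-level function to the aggregate means.
naive = np.exp(x_bar)

# Ratio of the true aggregate response to the naive transfer.
ratio = y_bar / naive
print("smallest ratio:", ratio.min())
print("average ratio: ", ratio.mean())
```

Because the exponential function is convex, every areal-unit mean of y exceeds the function evaluated at the areal-unit mean of x (Jensen's inequality), so the individual-level curve systematically understates the aggregate response in this sketch.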
Seminal work establishing linkages between spatial autoregressive and geostatistical models is interesting and illuminating. This articulation needs to be extended to space-time contexts, as well as to other model specifications such as spatial filtering and geographically varying coefficients. Autocorrelation is the key concept in these clusters of research; accordingly, spatial autocorrelation is fundamental, too. Many space-time datasets are dirty because they contain missing values (often in addition to unusual values). Using both the spatially and the temporally redundant information latent in a dataset allows imputations to be calculated for such missing values. This theme constitutes one of the principal problems that spatial analysis needs to solve; an urgent need exists for procedures that compute extremely accurate and precise imputations.
Two facets of exploratory spatial data analysis that merit attention are a better understanding of frequency distributions constructed with georeferenced data, and correlations between georeferenced random variables. Frequently spatial scientists inspect histograms as a first step in data analysis, often finding that these graphs fail to closely align with any of the numerous existing ideal frequency distributions. Chap. 4 furnishes basic insights into why this occurs. But a mathematical statistics theoretical basis needs to be established for the intuition and numerical demonstrations appearing in that chapter. Meanwhile, spatial scientists need to recognize that correlation coefficients can be dramatically altered by latent spatial autocorrelation; depending upon prevailing spatial patterns, these coefficients can be inflated toward 1 or −1, or they can be deflated toward 0. In other words, a correlation coefficient for a pair of georeferenced random variables cannot be taken at face value!
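The warning about correlation coefficients can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not a reproduction of the analyses in the book: it assumes a simultaneous autoregressive process on a 10-by-10 rook lattice with an autoregressive parameter of 0.9, and the function names are hypothetical. Two variables are generated independently of each other, so any sample correlation between them is spurious; the sketch compares how widely that sample correlation scatters with and without spatial autocorrelation.

```python
import numpy as np

def rook_weights(side):
    """Row-standardized rook-adjacency matrix for a side x side lattice."""
    n = side * side
    C = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if c + 1 < side:          # neighbor to the right
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < side:          # neighbor below
                C[i, i + side] = C[i + side, i] = 1.0
    return C / C.sum(axis=1, keepdims=True)

def correlation_spread(rho, side=10, reps=500, seed=0):
    """Standard deviation of the sample correlation between two
    independently generated variables, each following an assumed
    simultaneous autoregressive process with parameter rho."""
    rng = np.random.default_rng(seed)
    n = side * side
    A = np.linalg.inv(np.eye(n) - rho * rook_weights(side))
    r = np.empty(reps)
    for k in range(reps):
        x = A @ rng.standard_normal(n)
        y = A @ rng.standard_normal(n)   # drawn independently of x
        r[k] = np.corrcoef(x, y)[0, 1]
    return r.std()

sd_independent = correlation_spread(0.0)
sd_autocorrelated = correlation_spread(0.9)
print("spread of r, no autocorrelation:    ", sd_independent)
print("spread of r, strong autocorrelation:", sd_autocorrelated)
```

In this sketch the spread of the sample correlation is noticeably larger when both variables are strongly autocorrelated, so large positive or negative coefficients arise by chance far more often than the nominal sample size suggests.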
Contemporary statistical methodology allows spatial scientists to approximate impacts of unmeasured (i.e., latent) variables and/or measurement error by including a random effects term in a model specification, acknowledging that georeferenced data are noisy (i.e., contain considerable variability). This spatial statistical topic is at the forefront of the subdiscipline today. Estimates of these impacts can be obtained with Bayesian techniques, allowing analysis of a single geographic distribution (positing prior distributions furnishes the necessary ancillary information), or with frequentist techniques when repeated measures (i.e., multiple geographic distributions, which furnish ancillary information as repeated measures) are available. Such random effects almost always have a spatially structured component, which relates to the spatial autocorrelation displayed by a georeferenced random variable. Spatial structuring can be captured with an autoregressive model (e.g., the conditional autoregressive model used in Bayesian map analysis), or by a spatial filter (i.e., regressing a random effects term on a set of eigenvectors to separate it into a geographically varying mean response and a random error term).
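The spatial filter decomposition just described can be sketched numerically. The following is a minimal illustration under stated assumptions rather than the procedure used in the book: it assumes that the eigenvectors are extracted from the doubly centered binary rook-adjacency matrix of a 10-by-10 lattice, that the 20 eigenvectors with the largest eigenvalues are retained without any stepwise selection, and that the random effects term is a simulated autoregressive surface. All names are hypothetical.

```python
import numpy as np

def rook_adjacency(side):
    """Binary rook-adjacency matrix for a side x side lattice."""
    n = side * side
    C = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if c + 1 < side:
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < side:
                C[i, i + side] = C[i + side, i] = 1.0
    return C

def morans_i(z, C):
    """Moran coefficient of z for the binary weights matrix C."""
    z = z - z.mean()
    return (len(z) / C.sum()) * (z @ C @ z) / (z @ z)

side, k = 10, 20                      # k retained eigenvectors: an assumption
n = side * side
C = rook_adjacency(side)

# Eigenvectors of the doubly centered adjacency matrix (assumed filter basis).
M = np.eye(n) - np.ones((n, n)) / n
eigvals, eigvecs = np.linalg.eigh(M @ C @ M)   # eigenvalues in ascending order
E = eigvecs[:, ::-1][:, :k]                    # k largest eigenvalues first

# Simulated, spatially structured random effects term (hypothetical data).
rng = np.random.default_rng(1)
W = C / C.sum(axis=1, keepdims=True)
random_effect = np.linalg.solve(np.eye(n) - 0.9 * W, rng.standard_normal(n))

# Regress the random effects term on an intercept and the eigenvectors.
X = np.column_stack([np.ones(n), E])
coef, *_ = np.linalg.lstsq(X, random_effect, rcond=None)
spatial_mean = X @ coef                        # geographically varying mean
residual = random_effect - spatial_mean        # remaining error term

print("Moran coefficient, random effect:", morans_i(random_effect, C))
print("Moran coefficient, residual:     ", morans_i(residual, C))
```

The fitted values carry the strongly autocorrelated part of the map pattern, and the residual has a lower Moran coefficient than the original term. In practice the eigenvectors would be chosen by some selection rule rather than by a fixed count.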
These preceding discussions raise the question of relationships between spatial filtering and conventional spatial statistical models, which is the topic of Chap. 7. Spatial filtering offers the advantage of allowing a spatial scientist to work within the context of conventional statistical technology. It is consistent with statistical specifications associated with Bayesian map analysis: it represents spatial autocorrelation as a feature of model parameters, rather than correlated response variable values; as such, it casts a model intercept as an observation-specific surrogate for unobserved variables by expressing it as a spatially structured random deviation from some global intercept. This conceptualization posits that empirical probabilities are correct, while simple model parameters are not. In contrast, an auto-model posits that simple model parameters are correct, while empirical probabilities are conditional on other observations. Consequently, direct dependency between values of a response variable is replaced by the incorporation of spatial autocorrelation into prior parameter distributions, in Bayesian analysis, or a random effects intercept term, in frequentist analysis.
Finally, as can be surmised from the impacts of spatial autocorrelation on histograms (e.g., Chap. 4) and correlation coefficients (e.g., Chap. 5), spatial autocorrelation affects the prioritizing of, say, polluted sites for remediation based upon unusual values (e.g., hot spots), an attribute of dirty data. Any ranking of a set of georeferenced objects (e.g., the rank size distribution of city sizes) suffers from this same corruption. This feature of georeferenced data has been recognized for decades, but little work has been produced on it, even as more and more sets of georeferenced objects are ranked, some on an annual basis.
In conclusion, our Annals of Regional Science paper emphasizes a sizeable number of non-standard spatial statistics topics, some of which are treated in more depth in this book. The comprehensive treatments presented here begin a quest to stimulate interest in the methodologies presented and in possible further applications of these methodologies.
© 2011 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Griffith, D.A., Paelinck, J.H. (2011). General Conclusions: Spatial Statistics. In: Non-standard Spatial Statistics and Spatial Econometrics. Advances in Geographic Information Science, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16043-1_9
DOI: https://doi.org/10.1007/978-3-642-16043-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16042-4
Online ISBN: 978-3-642-16043-1
eBook Packages: Earth and Environmental Science (R0)