Abstract
Features often associated with datasets classified as “big spatial data” include massive size, complexity (e.g., the presence of spatial autocorrelation), and the failure of conventional analysis techniques originally designed for more modest sample sizes. These features characterize remotely sensed data, whose sample sizes may be only in the hundreds of thousands or millions, but whose spatial weights matrices (SWMs) contain a number of entries equal to the squares of these numbers. Consequently, spatial scientists have found that conventional spatial statistical/econometric techniques designed to handle data with n in the hundreds or thousands fail to handle practical-sized remotely sensed images. This chapter outlines revised techniques that circumvent this restriction for spatial autoregression analyses. It also lays a foundation for extending the findings reported here to sizeable irregular surface partitionings.
Sections of this chapter are adapted from, and extend, Griffith (2015).
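To make the scale argument concrete, the following minimal sketch counts the cells of an n-by-n spatial weights matrix and the memory a dense copy would occupy. The function names and the 8-byte (double-precision) cell size are illustrative assumptions, not part of the chapter.

```python
def dense_swm_cells(n: int) -> int:
    """Number of cells in an n-by-n spatial weights matrix."""
    return n * n


def dense_swm_bytes(n: int, bytes_per_cell: int = 8) -> int:
    """Bytes needed to store the matrix densely (8 bytes per cell assumed)."""
    return dense_swm_cells(n) * bytes_per_cell


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        terabytes = dense_swm_bytes(n) / 1e12
        print(f"n = {n:>9,}: {dense_swm_cells(n):.1e} cells, "
              f"{terabytes:.6f} TB if stored densely")
```

At n = 1,000,000 the dense matrix has 10^12 cells, or about 8 TB at 8 bytes per cell, which illustrates why techniques that form or decompose the full matrix become impractical at this scale.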
Notes
- 1.
Calibration refers to either estimation based upon a purposeful systematic sample covering a feasible parameter space or calculations based upon population moments.
- 2.
Mining began at the La Oroya, Peru, smelting site in 1893. Copper smelting began at the site in 1922 (70K tonnes capacity). Lead smelting began at the site in 1928 (120K tonnes capacity). Finally, zinc smelting began at the site in 1952 (45K tonnes capacity). Air pollution readings in 1999 indicated extreme levels of arsenic, cadmium, and lead.
- 3.
For a P-by-Q regular square tessellation and a queen adjacency SWM, \( \mathrm{TR}\left({\mathbf{W}}^{\mathrm{T}}\mathbf{W}\right)=\sum \limits_{j=1}^n{\lambda}_j^2+\frac{81P+81Q+326}{2400} \). For a P-by-Q (horizontal-by-vertical) regular hexagonal tessellation SWM, \( \mathrm{TR}\left({\mathbf{W}}^{\mathrm{T}}\mathbf{W}\right)=\sum \limits_{j=1}^n{\lambda}_j^2+\frac{5P+12Q+25}{180} \). The limit of these correction factors goes to zero as the size of a surface partitioning goes to infinity:
- 4.
For a given n, this correction factor appears to be of the form \( \frac{P+Q+12}{72}-\beta\ {\left(\frac{2}{\delta}\right)}^{\gamma }+\alpha +\beta\ {\left(\frac{1}{\delta +\rho }+\frac{1}{\delta -\rho}\right)}^{\gamma } \). For n = 202, k = 0.7222, \( \widehat{\alpha} \) = −0.0872, \( \widehat{\beta} \) = 0.4414, \( \widehat{\delta} \) = 1.0011, \( \widehat{\gamma} \) = 1.4571, and the RESS = \( 1.1\times {10}^{-5} \).
- 5.
This formulation allows negative values to be raised to noninteger exponents.
- 6.
This value minimized the chances of having a count of 0 or 100, neither of which occurs in the empirical data.
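The trace decomposition in note 3 can be checked numerically on small grids. The sketch below is only an illustration under stated assumptions: it takes W to be the row-standardized queen-adjacency SWM (the note does not state the standardization), and the helper names `queen_swm` and `trace_gap` are hypothetical. It computes TR(WᵀW), the sum of squared eigenvalues of W, and their difference, which plays the role of the correction factor.

```python
import numpy as np


def queen_swm(P: int, Q: int) -> np.ndarray:
    """Row-standardized queen-adjacency SWM for a P-by-Q square grid.

    Assumes P, Q >= 2 so that every cell has at least one neighbor.
    """
    n = P * Q
    C = np.zeros((n, n))
    for r in range(P):
        for c in range(Q):
            i = r * Q + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Skip the cell itself and anything off the grid.
                    if (dr or dc) and 0 <= rr < P and 0 <= cc < Q:
                        C[i, rr * Q + cc] = 1.0
    return C / C.sum(axis=1, keepdims=True)


def trace_gap(P: int, Q: int):
    """Return TR(W'W), the sum of squared eigenvalues of W, and their gap."""
    W = queen_swm(P, Q)
    tr_wtw = float(np.trace(W.T @ W))  # equals the sum of squared entries of W
    eig_sq = float(np.sum(np.linalg.eigvals(W) ** 2).real)  # equals TR(W @ W)
    return tr_wtw, eig_sq, tr_wtw - eig_sq


if __name__ == "__main__":
    for P, Q in [(3, 3), (5, 7), (20, 20)]:
        tr_wtw, eig_sq, gap = trace_gap(P, Q)
        print(f"{P}x{Q}: TR(W'W) = {tr_wtw:.6f}, "
              f"sum of squared eigenvalues = {eig_sq:.6f}, gap = {gap:.6f}")
```

Under these assumptions the 3-by-3 gap works out to 203/600, which agrees with the queen formula in note 3 evaluated at P = Q = 3. The 2-by-2 grid gives a gap of zero because W is then symmetric, so the closed form presumably applies only to larger grids.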
References
Burden, S., Cressie, N., & Steel, D. (2015). The SAR model for very large datasets: A reduced rank approach. Econometrics, 3, 317–338.
Cressie, N., Olsen, A., & Cook, D. (1996). Massive data sets: Problems and possibilities, with application to environmental monitoring. In The Committee of Applied and Theoretical Statistics (Ed.), Massive data sets: Proceedings of a Workshop (pp. 115–119). Washington, DC: National Academy Press.
Griffith, D. (2015). Approximation of Gaussian spatial autoregressive models for massive regular square tessellation data. International Journal of Geographical Information Science, 29, 2143–2173.
Kelejian, H., & Prucha, I. (2010). Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics, 157, 53–67.
Ord, J. (1975). Estimation methods for models of spatial interactions. Journal of the American Statistical Association, 70, 120–126.
Walde, J., Larch, M., & Tappeiner, G. (2008). Performance contest between MLE and GMM for huge spatial autoregressive models. Journal of Statistical Computation and Simulation, 78, 151–166.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Griffith, D.A., Paelinck, J.H.P. (2018). Spatial Autocorrelation Parameter Estimation for Massively Large Georeferenced Datasets. In: Morphisms for Quantitative Spatial Analysis. Advanced Studies in Theoretical and Applied Econometrics, vol 51. Springer, Cham. https://doi.org/10.1007/978-3-319-72553-6_7
DOI: https://doi.org/10.1007/978-3-319-72553-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72552-9
Online ISBN: 978-3-319-72553-6
eBook Packages: Economics and Finance (R0)