Spatial Autocorrelation Parameter Estimation for Massively Large Georeferenced Datasets

Morphisms for Quantitative Spatial Analysis

Part of the book series: Advanced Studies in Theoretical and Applied Econometrics ((ASTA,volume 51))

Abstract

Features often attributed to datasets classified as “big spatial data” include massive size, complexity (e.g., the presence of spatial autocorrelation), and the failure of conventional analysis techniques originally designed for more modest sample sizes. These features characterize remotely sensed data, whose sizes may be only in the hundreds of thousands or millions, but whose spatial weights matrices have a number of entries equal to the squares of these numbers. Consequently, spatial scientists have found that conventional spatial statistical and econometric techniques designed to handle data with n in the hundreds or thousands fail to handle practical-sized remotely sensed images. This chapter outlines revised techniques to circumvent this restriction for spatial autoregression analyses. It also lays a foundation for extending the findings reported here to sizeable irregular surface partitionings.

Sections of this chapter are adapted from, and extend, Griffith (2015).
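
The scaling problem described in the abstract can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the chapter: the function names, the 8-byte entry size, and the 1000-by-1000 image are illustrative assumptions. It compares the storage a dense n-by-n spatial weights matrix would need with the number of nonzero entries in a queen adjacency matrix for a P-by-Q regular square tessellation.

```python
def dense_swm_bytes(n: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to hold a dense n-by-n spatial weights matrix.

    bytes_per_entry = 8 assumes double-precision storage (an assumption).
    """
    return n * n * bytes_per_entry


def queen_nonzeros(p: int, q: int) -> int:
    """Nonzero entries in the queen adjacency matrix of a P-by-Q square grid.

    Each undirected neighbor link contributes two nonzero entries.
    """
    rook_links = p * (q - 1) + q * (p - 1)   # shared edges
    diagonal_links = 2 * (p - 1) * (q - 1)   # shared corners
    return 2 * (rook_links + diagonal_links)


if __name__ == "__main__":
    p = q = 1000                 # hypothetical image dimensions
    n = p * q                    # one million pixels
    print("dense bytes:", dense_swm_bytes(n))
    print("queen nonzeros:", queen_nonzeros(p, q))
```

For this hypothetical one-million-pixel image, the dense matrix has 10^12 entries (about 8 terabytes at 8 bytes each), whereas the queen adjacency structure has roughly 8 million nonzero entries, which is why dense-matrix techniques break down long before the data themselves become unmanageable.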

Notes

  1. Calibration refers to either estimation based upon a purposeful systematic sample covering a feasible parameter space or calculations based upon population moments.

  2. Mining began at the La Oroya, Peru, smelting site in 1893. Copper smelting began at the site in 1922 (70K tonnes capacity). Lead smelting began at the site in 1928 (120K tonnes capacity). Finally, zinc smelting began at the site in 1952 (45K tonnes capacity). Air pollution readings in 1999 indicated extreme levels of arsenic, cadmium, and lead.

  3. For a P-by-Q regular square tessellation and a queen adjacency SWM, \( \mathrm{TR}\left({\mathbf{W}}^{\mathrm{T}}\mathbf{W}\right)=\sum \limits_{j=1}^n{\lambda}_j^2+\frac{81P+81Q+326}{2400} \). For a P-by-Q (horizontal-by-vertical) regular hexagonal tessellation SWM, \( \mathrm{TR}\left({\mathbf{W}}^{\mathrm{T}}\mathbf{W}\right)=\sum \limits_{j=1}^n{\lambda}_j^2+\frac{5P+12Q+25}{180} \). The limit of these correction factors goes to zero as the size of a surface partitioning goes to infinity:

  4. For a given n, this correction factor appears to be of the form \( \frac{P+Q+12}{72}-\beta\ {\left(\frac{2}{\delta}\right)}^{\gamma }+\alpha +\beta\ {\left(\frac{1}{\delta +\rho }+\frac{1}{\delta -\rho}\right)}^{\gamma } \). For n = 202, k = 0.7222, \( \widehat{\alpha} \) = −0.0872, \( \widehat{\beta} \) = 0.4414, \( \widehat{\delta} \) = 1.0011, \( \widehat{\gamma} \) = 1.4571, and the RESS = \( 1.1\times {10}^{-5} \).

  5. This formulation allows negative values to be raised to noninteger exponents.

  6. This value minimized the chances of having a count of 0 or 100, neither of which occurs in the empirical data.
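
The relationship in note 3 between \( \mathrm{TR}\left({\mathbf{W}}^{\mathrm{T}}\mathbf{W}\right) \) and the sum of squared eigenvalues can be explored numerically on small grids. The sketch below is illustrative only and makes assumptions the notes do not state: it takes W to be the row-standardized version of the binary queen adjacency matrix, builds it densely with NumPy, and uses hypothetical function names. It reports the numerical gap between the two quantities and does not attempt to reproduce the closed-form correction factors quoted above.

```python
import numpy as np


def queen_adjacency(p: int, q: int) -> np.ndarray:
    """Binary queen adjacency matrix for a grid with p columns and q rows."""
    n = p * q
    c = np.zeros((n, n))
    for r in range(q):
        for s in range(p):
            i = r * p + s
            for dr in (-1, 0, 1):
                for ds in (-1, 0, 1):
                    if dr == 0 and ds == 0:
                        continue  # a cell is not its own neighbor
                    rr, ss = r + dr, s + ds
                    if 0 <= rr < q and 0 <= ss < p:
                        c[i, rr * p + ss] = 1.0
    return c


def row_standardize(c: np.ndarray) -> np.ndarray:
    """Divide each row by its sum (assumes every cell has a neighbor)."""
    return c / c.sum(axis=1, keepdims=True)


def trace_gap(p: int, q: int):
    """Return TR(W'W), the sum of squared eigenvalues of W, and their gap."""
    w = row_standardize(queen_adjacency(p, q))
    tr_wtw = float(np.trace(w.T @ w))
    eigenvalues = np.linalg.eigvals(w)
    # Eigenvalues of a row-standardized symmetric adjacency matrix are real;
    # .real discards any tiny imaginary parts left by rounding error.
    sum_sq = float(np.sum(eigenvalues ** 2).real)
    return tr_wtw, sum_sq, tr_wtw - sum_sq


if __name__ == "__main__":
    for p, q in [(3, 3), (5, 5), (10, 10)]:
        print(p, q, trace_gap(p, q))
```

Because the sum of squared eigenvalues equals the trace of W squared, the eigenvalue step can be replaced by a trace computation when n is large; the eigenvalue call is kept here only to mirror the notation of note 3. Dense matrices are used for readability and are practical only for small P and Q.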

References

  • Burden, S., Cressie, N., & Steel, D. (2015). The SAR model for very large datasets: A reduced rank approach. Econometrics, 3, 317–338.

  • Cressie, N., Olsen, A., & Cook, D. (1996). Massive data sets: Problems and possibilities, with application to environmental monitoring. In The Committee of Applied and Theoretical Statistics (Ed.), Massive data sets: Proceedings of a Workshop (pp. 115–119). Washington, DC: National Academy Press.

  • Griffith, D. (2015). Approximation of Gaussian spatial autoregressive models for massive regular square tessellation data. International Journal of Geographical Information Science, 29, 2143–2173.

  • Kelejian, H., & Prucha, I. (2010). Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics, 157, 53–67.

  • Ord, J. (1975). Estimation methods for models of spatial interactions. Journal of the American Statistical Association, 70, 120–126.

  • Walde, J., Larch, M., & Tappeiner, G. (2008). Performance contest between MLE and GMM for huge spatial autoregressive models. Journal of Statistical Computation and Simulation, 78, 151–166.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Griffith, D.A., Paelinck, J.H.P. (2018). Spatial Autocorrelation Parameter Estimation for Massively Large Georeferenced Datasets. In: Morphisms for Quantitative Spatial Analysis. Advanced Studies in Theoretical and Applied Econometrics, vol 51. Springer, Cham. https://doi.org/10.1007/978-3-319-72553-6_7
