Abstract
Spatial data are often contaminated by imperfections that reduce their quality and can dramatically distort inferential conclusions drawn from spatial econometric models. The “clean” ideal considered in standard spatial econometrics textbooks is one in which Cliff-Ord-type models are fitted to data where the spatial units constitute the full population, no data are missing, and the observations are free from measurement and locational errors. In practice the situation is often very different, and datasets contain all sorts of imperfections: they are frequently based on a sample drawn from the whole population, some data are missing, and they almost invariably contain both attribute and locational errors. This is the situation of “dirty” spatial econometric modeling. Through a series of Monte Carlo experiments, this paper examines the effects of two specific sources of dirt, namely missing data and locational errors, on spatial econometric model estimation and hypothesis testing.
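To make the idea of a Monte Carlo experiment on locational error concrete, the following is a minimal illustrative sketch and not the design used in the paper. It assumes a spatial lag data-generating process on randomly placed points, a k-nearest-neighbour weight matrix, Gaussian coordinate displacement, and Moran's I as a simple summary statistic in place of full model estimation. All function names and parameter values (`knn_weights`, `morans_i`, `simulate`, `rho`, `sigma_loc`) are hypothetical choices made for illustration, and missing data are not covered.

```python
import numpy as np


def knn_weights(coords, k=5):
    """Row-standardised k-nearest-neighbour weight matrix (illustrative choice)."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a unit is never its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]   # indices of the k closest units
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0 / k
    return W


def morans_i(y, W):
    """Moran's I of y under weight matrix W."""
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)


def simulate(n=100, rho=0.6, sigma_loc=0.05, reps=50, seed=0):
    """Average Moran's I with true versus displaced locations.

    Assumed data-generating process: y = rho * W y + eps on uniform random points.
    """
    rng = np.random.default_rng(seed)
    clean, dirty = [], []
    for _ in range(reps):
        coords = rng.uniform(size=(n, 2))
        W = knn_weights(coords)
        eps = rng.standard_normal(n)
        y = np.linalg.solve(np.eye(n) - rho * W, eps)
        # Locational error: each point is displaced by Gaussian noise
        moved = coords + rng.normal(scale=sigma_loc, size=(n, 2))
        W_dirty = knn_weights(moved)
        clean.append(morans_i(y, W))
        dirty.append(morans_i(y, W_dirty))
    return float(np.mean(clean)), float(np.mean(dirty))


if __name__ == "__main__":
    clean_mean, dirty_mean = simulate()
    print(f"mean Moran's I, true locations:      {clean_mean:.3f}")
    print(f"mean Moran's I, displaced locations: {dirty_mean:.3f}")
```

Under these assumptions, building the weight matrix from displaced coordinates would be expected to attenuate the measured spatial autocorrelation, which is the kind of distortion the experiments described above are meant to quantify.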