Spatial lag dependence in the presence of missing observations

  • Original Paper
The Annals of Regional Science

Abstract

We explore the estimation effectiveness of spatial lag models in the presence of missing observations. Spatial lag models are used to measure interdependency between dependent variables. If there are no missing data, it is easy to interpret this spatial autocorrelation process. Very sparsely sampled data are sometimes used in empirical studies. For such data, we observe only a small part of a population containing possible mutual dependencies. Simulation studies based on artificial data confirm the relation between the sampling rate and selection ratio of spatial and non-spatial models. Our findings include the following: (1) Negative spatial autocorrelation of the data-generating process (DGP) may not be observed. (2) Positive spatial autocorrelation of the DGP may be observed, but it is downward-biased. (3) We obtain less-biased estimates if we use a non-row-standardized weight matrix. (4) Non-spatial models tend to be selected in preference to the correct model, the spatial lag model. (5) Estimates of regression coefficients remain almost unbiased.
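The experiment described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch, not the author's code: since Eq. (1) is not reproduced on this page, it assumes the standard spatial lag form \(y = \rho Wy + X\beta + \varepsilon\). The regular 20 × 20 grid, the inverse-distance weight function with a cutoff, the parameter values (\(\rho = 0.6\), \(\beta = (1, 1)\), \(\gamma = 0.25\)), the grid-search estimator, and all function names are assumptions made for illustration.

```python
import numpy as np


def distance_weights(coords, alpha=2.0, cutoff=5.0, row_standardize=True):
    """Inverse-distance weights: w_ij = d_ij**(-alpha) if 0 < d_ij <= cutoff, else 0."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    near = (dist > 0) & (dist <= cutoff)
    w = np.zeros_like(dist)
    w[near] = dist[near] ** (-alpha)
    if row_standardize:
        row_sums = w.sum(axis=1, keepdims=True)
        # Isolated units (no neighbor within the cutoff) keep an all-zero row.
        w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    return w


def simulate_lag(w, rho, beta, rng):
    """Draw y from the assumed spatial lag DGP y = rho*W*y + X*beta + eps."""
    n = w.shape[0]
    x = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal(size=n)
    y = np.linalg.solve(np.eye(n) - rho * w, x @ beta + eps)
    return y, x


def ml_rho(y, x, w, grid=np.linspace(-0.9, 0.9, 91)):
    """Grid-search ML estimate of rho from the concentrated log-likelihood."""
    n = len(y)
    q, _ = np.linalg.qr(x)        # orthonormal basis for the columns of X
    e0 = y - q @ (q.T @ y)        # OLS residuals of y on X
    wy = w @ y
    e1 = wy - q @ (q.T @ wy)      # OLS residuals of Wy on X
    loglik = []
    for rho in grid:
        _, logdet = np.linalg.slogdet(np.eye(n) - rho * w)
        resid = e0 - rho * e1
        loglik.append(logdet - 0.5 * n * np.log(resid @ resid / n))
    return grid[int(np.argmax(loglik))]


# Hypothetical population: units on a regular 20 x 20 grid.
rng = np.random.default_rng(0)
side = 20
coords = np.array([(i, j) for i in range(side) for j in range(side)], dtype=float)
w_full = distance_weights(coords)
y, x = simulate_lag(w_full, rho=0.6, beta=np.array([1.0, 1.0]), rng=rng)

# Each unit is observed independently with probability gamma.
gamma = 0.25
observed = rng.random(len(y)) < gamma
# The analyst only sees observed units, so W is rebuilt from them alone.
w_obs = distance_weights(coords[observed])

rho_full = ml_rho(y, x, w_full)
rho_obs = ml_rho(y[observed], x[observed], w_obs)
print(f"observed {observed.sum()} of {len(y)} units")
print(f"rho_hat (full) = {rho_full:.2f}, rho_hat (sample) = {rho_obs:.2f}")
```

A single draw like this says little on its own; repeating it over many replications and several values of \(\gamma\) would be needed to see the average behavior the abstract reports. Passing `row_standardize=False` to `distance_weights` gives the non-row-standardized variant, although the grid of admissible \(\rho\) values in `ml_rho` would then need to be adjusted to the eigenvalues of that matrix.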




Notes

  1. The edge (or border) effect described in Anselin (1988, pp. 175–176) is closely related to the missing data problem, in that it affects the expected value of the disturbance term and introduces heteroscedasticity.

  2. See also Little and Rubin (2002, ch. 1) and Arbia et al. (2015) for a more detailed survey of missing data and some examples.

  3. In the current study, every data unit is observed with probability \(\gamma \). Arbia et al. (2015) express the sampling rate through ‘PMP,’ defined as the proportion of missing points; their PMP \(=0.05\) and PMP \(=0.25\) correspond to our \(\gamma =0.95\) and \(\gamma =0.75\), respectively. Wang and Lee (2013a, b) use ‘\(\alpha \),’ defined as the missing data percentage; their \(\alpha =10, 20\), and 40 correspond to our \(\gamma =0.9, 0.8\), and 0.6, respectively.

  4. See Anselin (1988, ch. 6) for details of these two methods.

  5. We also estimated Eq. (1) by spatial two-stage least squares (Kelejian and Prucha 1998). The results were nearly identical to those of ‘lag’ and are therefore omitted.

  6. See Stakhovych and Bijmolt (2009, p. 393) for a survey of spatial weights matrices used in simulation analyses.

  7. The current simulation study employs \(\alpha = 2\) and \(\bar{d} = 5\).

  8. Note that there is no spatial autocorrelation in the DGP when the true \(\rho \) value is 0.

  9. Arbia et al. (2015) also pointed out that spatial correlation disappears when some points are missing.

  10. A referee kindly suggested that ‘the SAR parameter \(\rho \) loses its interpretation, and its relationship with the eigen values of weight matrix is lost’ in this setting. Although this point warrants further discussion, here we simply point out the possibility of using a non-row-standardized weight matrix.

  11. Arbia et al. (2015) performed experiments for relatively dense sampling cases. They control the spatial intensity of data deletion with a parameter \(\psi \). Their \(\psi =0\) result corresponds to our experiments. A comparison between their results and ours suggests the following. (1) The decline patterns of spatial parameter efficiency are similar. (2) The decline patterns of regression coefficient efficiency differ. In our sparse sampling experiments, the pattern does not depend on the true value of the spatial parameter. In the dense sampling experiments of Arbia et al. (2015), it does.
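
The correspondences in note 3 follow from reading \(\gamma \) as the complement of the missing share. As a quick check, taking PMP as a proportion and Wang and Lee's \(\alpha \) as a percentage, as stated there (this \(\alpha \) is distinct from the simulation parameter \(\alpha \) in note 7):

\[
\gamma = 1 - \mathrm{PMP} = 1 - \frac{\alpha }{100},
\qquad \text{e.g.}\quad 1 - 0.25 = 0.75, \quad 1 - \frac{40}{100} = 0.6 .
\]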

References

  • Anselin L (1988) Spatial econometrics: methods and models. Kluwer Academic, Dordrecht

  • Arbia G, Espa G, Giuliani D (2015) Dirty spatial econometrics. DEM Discussion Papers, 2015/09, University of Trento, Department of Economics and Management. http://web.unitn.it/files/download/27419/dem2015_09

  • Freeman JR (1989) Systematic sampling, temporal aggregation, and the study of political relationships. Polit Anal 1(1):61–98

  • Goulard M, Laurent T, Thomas-Agnan C (2009) About predictions in spatial SAR models: optimal and almost optimal strategies. Paper presented at the 3rd World Conference of the Spatial Econometrics Association, 10 July 2009

  • Griffith DA, Bennett RJ, Haining RP (1989) Statistical analysis of spatial data in the presence of missing observations: a methodological guide and an application to urban census data. Environ Plan A 21(11):1511–1523

  • Kelejian HH, Piras G (2011) An extension of Kelejian’s J-test for non-nested spatial models. Reg Sci Urban Econ 41(3):281–292

  • Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121

  • Kelejian HH, Prucha IR (2010) Spatial models with spatially lagged dependent variables and incomplete data. J Geogr Syst 12:241–257

  • LeSage JP, Kelley Pace R (2004) Models for spatially dependent missing data. J Real Estate Finance Econ 29(2):233–254

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data. Wiley-Interscience, New York

  • Stakhovych S, Bijmolt THA (2009) Specification of spatial models: a simulation study on weights matrices. Pap Reg Sci 88:389–408

  • Wang W, Lee L-F (2013a) Estimation of spatial autoregressive models with randomly missing data in the dependent variable. Econom J 16(1):73–102

  • Wang W, Lee L-F (2013b) Estimation of spatial panel data models with randomly missing data in the dependent variable. Reg Sci Urban Econ 43(3):521–538


Author information

Corresponding author

Correspondence to Takahisa Yokoi.

About this article

Cite this article

Yokoi, T. Spatial lag dependence in the presence of missing observations. Ann Reg Sci 60, 25–40 (2018). https://doi.org/10.1007/s00168-015-0737-2

  • DOI: https://doi.org/10.1007/s00168-015-0737-2
