Spatio-Temporal Instrumental Variables Regression with Missing Data: A Bayesian Approach

Computational Economics

Abstract

This paper proposes an extension of Bayesian instrumental variables regression that allows for spatial and temporal correlation among observations. To this end, we introduce a doubly separable covariance matrix, adopting a conditional autoregressive (CAR) structure for the spatial component and a first-order autoregressive process for the temporal component. We also introduce a Bayesian multiple imputation procedure that handles missing data while accounting for the associated uncertainty. The inference procedure is described together with a step-by-step Markov chain Monte Carlo (MCMC) algorithm for parameter estimation. We illustrate the methodology with a simulation study and a real application that investigates how broadband affects the Gross Domestic Product of municipalities in the state of Mato Grosso do Sul from 2010 to 2017.
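To make the covariance structure named in the abstract concrete, the following is a minimal sketch of a separable space-time covariance built as a Kronecker product of an AR(1) temporal correlation matrix and a proper CAR spatial covariance. It is an illustration under stated assumptions, not the authors' implementation: the function names, the symbols `rho`, `phi` and `tau2`, the adjacency matrix `W`, the proper CAR form \((D - \rho W)^{-1}\), and the time-major ordering are all hypothetical choices, and the paper's exact parameterization is not visible in this preview.

```python
import numpy as np

def car_covariance(W, rho, tau2=1.0):
    """Covariance of a proper CAR model, tau2 * (D - rho * W)^{-1}.

    W is a symmetric 0/1 adjacency matrix and D is the diagonal matrix of
    neighbour counts. This parameterization is an assumption.
    """
    D = np.diag(W.sum(axis=1))
    return tau2 * np.linalg.inv(D - rho * W)

def ar1_correlation(n_times, phi):
    """AR(1) correlation matrix with entries phi ** |t - s|."""
    idx = np.arange(n_times)
    return phi ** np.abs(idx[:, None] - idx[None, :])

# Hypothetical toy example: 3 areas on a line (1-2-3) observed at 4 times.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
sigma_space = car_covariance(W, rho=0.9)
sigma_time = ar1_correlation(4, phi=0.5)

# Separable covariance for observations stacked time-major:
# (all areas at time 1, then all areas at time 2, ...).
sigma = np.kron(sigma_time, sigma_space)
print(sigma.shape)
```

One reason separable structures are attractive is computational: the inverse and determinant of the full matrix follow from those of the two small factors, so the 12 x 12 matrix above never has to be factorized directly.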

Availability of Data and Material

Available at https://github.com/marcuslavagnole.

Funding

No funding to declare.

Author information

Contributions

Marcus L. Nascimento conceived the article, implemented the code, and drafted the manuscript; Kelly C. M. Gonçalves contributed to the conception and revised the article; Mario Jorge Mendonça supplied the data set.

Corresponding author

Correspondence to Marcus L. Nascimento.

Ethics declarations

Conflict of interest

No conflicts of interest.

Ethical approval

No particular ethical approval was required for this study because it does not entail human participation or personal data.

Informed consent

No consent was required for this study because it does not entail human participation or personal data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Summary Statistics

See Table 4.

Table 4 Mean, standard deviation and quartiles of dependent, exogenous and endogenous variables in the logarithm scale for all years under analysis

Appendix B: Assessment of MCMC Convergence

See Figs. 4 and 5.

Fig. 4

Trace plots of MCMC draws for parameters \(\beta \), \(\beta ^{*}_{0}\), \(\beta ^{*}_{1}\), \(\beta ^{*}_{2}\) and \(\beta ^{*}_{3}\). The solid lines represent the true values and the dashed lines represent the 95% HPD intervals. The left panels show how quickly the chains converge to the true values, and the right panels show the chains after discarding the burn-in period and applying thinning.

Fig. 5

Trace plots of MCMC draws for parameters \(\delta _{0}\), \(\delta _{1}\), \(\phi \) and \(\rho \). The solid lines represent the true values and the dashed lines represent the 95% HPD intervals. The left panels show how quickly the chains converge to the true values, and the right panels show the chains after discarding the burn-in period and applying thinning.
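The post-processing mentioned in the captions (discarding a burn-in period, thinning, and summarizing with a 95% HPD interval) can be sketched as follows. This is a generic illustration, not the authors' code: the helper names `post_process` and `hpd_interval`, the burn-in length, the thinning step, and the simulated chain are all assumptions made for the example.

```python
import numpy as np

def post_process(chain, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return np.asarray(chain)[burn_in::thin]

def hpd_interval(draws, prob=0.95):
    """Shortest interval containing at least `prob` of the draws.

    Slides a window of m = ceil(prob * n) sorted draws and returns the
    endpoints of the narrowest one (appropriate for unimodal posteriors).
    """
    x = np.sort(np.asarray(draws))
    n = x.size
    m = int(np.ceil(prob * n))
    widths = x[m - 1:] - x[: n - m + 1]
    start = int(np.argmin(widths))
    return x[start], x[start + m - 1]

# Hypothetical chain: independent standard normal draws stand in for
# real MCMC output, which would typically be autocorrelated.
rng = np.random.default_rng(0)
chain = rng.normal(size=20_000)

kept = post_process(chain, burn_in=5_000, thin=10)
lower, upper = hpd_interval(kept, prob=0.95)
print(len(kept), lower, upper)
```

For a symmetric unimodal posterior the HPD interval is close to the equal-tailed interval; the two differ noticeably when the posterior is skewed, which can happen for correlation-type parameters near a boundary.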

About this article

Cite this article

Nascimento, M.L., Gonçalves, K.C.M. & Mendonça, M.J. Spatio-Temporal Instrumental Variables Regression with Missing Data: A Bayesian Approach. Comput Econ 62, 29–47 (2023). https://doi.org/10.1007/s10614-022-10269-z

  • DOI: https://doi.org/10.1007/s10614-022-10269-z
