Abstract
This paper extends the static and dynamic \(\varepsilon \)-contamination papers of Baltagi et al. (J Econom 202:108–123, 2018; Advances in econometrics, essays in honor of M. Hashem Pesaran, Emerald Publishing, Bingley, 2021) to dynamic space–time models. We investigate the robustness of Bayesian panel data models to possible misspecification of the prior distribution. The proposed robust Bayesian approach departs from the standard Bayesian framework in two ways. First, we consider the \(\varepsilon \)-contamination class of prior distributions for the model parameters as well as for the individual effects. Second, both the base elicited priors and the \(\varepsilon \)-contamination priors use the g-priors of Zellner (Bayesian inference and decision techniques: essays in honor of Bruno de Finetti. Studies in Bayesian econometrics, vol 6, North-Holland, Amsterdam, pp 389–399, 1986) for the variance–covariance matrices. We propose a general “toolbox” for a wide range of specifications, which includes the dynamic space–time panel model with random effects, with cross-correlated effects à la Chamberlain, or in the Hausman–Taylor world, as well as dynamic panel data models with homogeneous/heterogeneous slopes and cross-sectional dependence. Using an extensive Monte Carlo simulation study, we compare the finite sample properties of our proposed estimator to those of standard classical estimators. We illustrate our robust Bayesian estimator using the same data as in Keane and Neal (Quant Econ 11:1391–1429, 2020). We obtain short-run as well as long-run effects of climate change on corn producers in the USA.
Notes
We thank Yuehua Wu for the helpful discussions on this issue. Unfortunately, their bias correction method is ineffective when \(T<50\) irrespective of N.
“We consider the most commonly used method of selecting a hopefully robust prior in \(\Gamma \) (the \(\varepsilon \)-contamination class of prior distributions), namely choice of that prior \(\pi \) which maximizes the marginal likelihood \(m ( y | \pi )\) over \(\Gamma \). This process is called Type II maximum likelihood by Good (1965)” (Berger and Berliner 1986, p. 463).
Yu et al. (2008) observed that \(y_t\) can have some nonstationary components if \(\phi + \rho + \delta = 1\) but, as underlined by Parent and LeSage (2011), stationarity does not require that \(| \phi | + | \rho | + | \delta | < 1\). LeSage et al. (2019) recall that, for stable processes, the dependence parameters \(\phi \), \(\rho \) and \(\delta \) must satisfy \(\phi + \rho +\delta <1 \) and, for cases where \(\rho - \delta > 0\), \(\phi - \rho + \delta > -1 \). See also Parent and LeSage (2011).
The literature generally recommends using the unit information prior (UIP) to set the g-priors (see Sect. 4.1).
A one-step estimation of the ML-II posterior distribution is possible but hardly feasible. This is because the probability density functions of y and that of the base prior \(\pi _{0}\left( \theta , b ,\tau | g_{0},h_{0}\right) \) need to be combined to get the predictive density. The resulting expression is highly complex and its integration with respect to \(\left( \theta ,b,\tau \right) \) is quite involved.
\( \varepsilon =0.5\) is an arbitrary value. We implicitly assume that the amount of error in the base elicited prior is \( 50\%\). In other words, \( \varepsilon =0.5\) means that we elicit the \(\pi _{0} \) prior but feel we could be as much as \(50\%\) off (in terms of implied probability sets).
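To make the role of \(\varepsilon \) concrete, the minimal sketch below computes the posterior weight placed on the base prior when the prior is a two-component mixture \((1-\varepsilon )\pi _{0} + \varepsilon q\). It only uses the standard result that a mixture prior yields a mixture posterior whose weights are proportional to the prior weight times the marginal likelihood. The function name and the marginal likelihood values are hypothetical illustrations, not quantities from the paper.

```python
def base_prior_weight(eps: float, m_base: float, m_contam: float) -> float:
    """Posterior weight on the base prior pi_0 under the mixture prior
    (1 - eps) * pi_0 + eps * q, given the marginal likelihoods
    m(y | pi_0) = m_base and m(y | q) = m_contam."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    numerator = (1.0 - eps) * m_base
    denominator = numerator + eps * m_contam
    if denominator <= 0.0:
        raise ValueError("marginal likelihoods must give a positive total")
    return numerator / denominator


# Hypothetical marginal likelihoods: the data favor the base prior 2 to 1.
weight = base_prior_weight(eps=0.5, m_base=2.0, m_contam=1.0)
print(round(weight, 4))  # 0.6667
```

With \(\varepsilon =0.5\) the prior weights cancel, so the posterior weight reduces to the ratio of the base marginal likelihood to the sum of the two marginal likelihoods.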
We chose: \(\theta _{0}=0,b_{0}=0\) and \(\tau =1.\)
See section C in the supplementary material. For the MCMC Gibbs sampling, we explicitly introduce uniform distributions for \(\phi \), \(\rho \) and \(\delta \). We use 1000 draws and 500 burn-in draws.
We use our own R codes for the Bayesian two-stage two-step model (B2S2S) and the MCMC Gibbs sampling and the “xtdpdqml” Stata command for the QML estimator. We use the same DGP set under R and Stata environments to compare the three methods.
The simulations were conducted using R version 3.3.2 on a MacBook Pro with a 2.8 GHz Core i7 processor and 16 GB of 1600 MHz DDR3 RAM.
For the sake of brevity, we will henceforth write B2S2S_mixt and B2S2S_boot when referring to the B2S2S estimators with mixtures of t-distributions and with block resampling bootstrap, respectively.
Strictly speaking, we should mention “posterior means” and “posterior standard errors” whenever we refer to Bayesian estimates and “coefficients” and “standard errors” when discussing frequentist ones. For the sake of brevity, we will use “coefficients” and “standard errors” in both cases.
The “nse,” often referred to as the Monte Carlo error, is equal to the difference between the mean of the sampled values and the true posterior mean. As a rule of thumb, as many simulations as necessary should be conducted so as to ensure that the Monte Carlo error of each parameter of interest is less than approximately \(10\%\) of the sample standard error. As shown in the table, the estimated nse values easily satisfy this criterion. The “cd” compares means calculated from the first \(10\%\) and the last \(40\%\) of the draws of the Markov chain. Under the null hypothesis of no difference between these means, \(cd \sim N(0,1)\), which indicates that a sufficiently large number of draws have been taken. See Koop (2003) and Koop et al. (2007).
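As an illustration of the two diagnostics described in this note, the sketch below computes an nse and a cd statistic for a chain of draws. It is not the authors' code. It assumes a batch-means estimator of the numerical standard error (one of several possible choices; spectral estimators are also common), reads the \(10\%\) rule as a comparison with the standard deviation of the draws, and uses simulated independent normal draws as a stand-in for a real chain.

```python
import math
import random
import statistics


def batch_means_nse(draws, n_batches=10):
    """Numerical standard error of the mean from non-overlapping batch means."""
    size = len(draws) // n_batches
    if n_batches < 2 or size < 1:
        raise ValueError("need at least two non-empty batches")
    means = [statistics.fmean(draws[b * size:(b + 1) * size])
             for b in range(n_batches)]
    return statistics.stdev(means) / math.sqrt(n_batches)


def convergence_cd(draws, first=0.10, last=0.40):
    """Difference between the means of the first and last segments of the
    chain, scaled by the combined numerical standard error."""
    n = len(draws)
    head = draws[:int(first * n)]
    tail = draws[n - int(last * n):]
    diff = statistics.fmean(head) - statistics.fmean(tail)
    scale = math.sqrt(batch_means_nse(head) ** 2 + batch_means_nse(tail) ** 2)
    return diff / scale


random.seed(0)
# Stand-in for 1000 retained draws of one parameter (assumption).
chain = [random.gauss(0.0, 1.0) for _ in range(1000)]

nse = batch_means_nse(chain)
cd = convergence_cd(chain)
rule_ok = nse < 0.10 * statistics.stdev(chain)  # 10% rule of thumb
print(f"nse={nse:.4f}  cd={cd:.3f}  rule satisfied: {rule_ok}")
```

Values of cd far out in the tails of a standard normal would suggest that more draws are needed.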
Recall that we use only \(BR=20\) individual block bootstrap samples. Fortunately, the results are very robust to the value of BR. For instance, increasing BR from 20 to 200 in the random effects world increases the computation time tenfold but yields practically the same results.
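The resampling scheme can be sketched as follows, under the assumption that “individual block bootstrap” means drawing individuals with replacement while keeping each individual's full time series together. The function name and the toy panel are hypothetical; the paper's own R implementation may differ.

```python
import random


def individual_block_bootstrap(panel, n_samples=20, seed=0):
    """Return n_samples resampled panels.

    panel is a list of N individuals, each a list of T observations.
    Individuals are drawn with replacement and each drawn individual
    keeps all of its T observations, so time dependence is preserved.
    """
    rng = random.Random(seed)
    n = len(panel)
    return [[list(panel[rng.randrange(n)]) for _ in range(n)]
            for _ in range(n_samples)]


# Toy panel with N = 5 individuals and T = 3 periods (illustrative values).
toy_panel = [[10 * i + t for t in range(3)] for i in range(5)]
samples = individual_block_bootstrap(toy_panel, n_samples=20)
print(len(samples), len(samples[0]), len(samples[0][0]))  # 20 5 3
```

Here `n_samples` plays the role of BR: the estimator would be recomputed on each resampled panel.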
For the \(N=63\) census tract rook-style and queen-style contiguities within Syracuse city, the non-sparsity rates are, respectively, \(8.72\%\) and \(7.76\%\) while that of the inverse distance weighting matrix is \(98.41\%\).
The 4-nearest and 10-nearest neighbors weighting matrices have non-sparsity rates of \(6.35\%\) and \(15.87\%\), respectively.
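The reported rates for the nearest-neighbor and inverse distance matrices are consistent with reading the non-sparsity rate as the share of nonzero entries in an \(N \times N\) matrix with \(N=63\): k/N for k nearest neighbors and \((N-1)/N\) for inverse distance. The sketch below checks this on randomly generated coordinates, which are hypothetical and not the Syracuse census tract locations.

```python
import math
import random


def knn_weights(coords, k):
    """Row-normalized k-nearest-neighbors weight matrix (zero diagonal)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            w[i][j] = 1.0 / k
    return w


def inverse_distance_weights(coords):
    """Inverse distance weight matrix (zero diagonal, not normalized)."""
    n = len(coords)
    return [[0.0 if i == j else 1.0 / math.dist(coords[i], coords[j])
             for j in range(n)] for i in range(n)]


def non_sparsity_rate(w):
    """Percentage of nonzero entries in a square weight matrix."""
    n = len(w)
    nonzero = sum(1 for row in w for x in row if x != 0.0)
    return 100.0 * nonzero / (n * n)


rng = random.Random(1)
coords = [(rng.random(), rng.random()) for _ in range(63)]  # hypothetical

rate_4nn = non_sparsity_rate(knn_weights(coords, 4))
rate_10nn = non_sparsity_rate(knn_weights(coords, 10))
rate_inv = non_sparsity_rate(inverse_distance_weights(coords))
print(round(rate_4nn, 2), round(rate_10nn, 2), round(rate_inv, 2))
# 6.35 15.87 98.41
```

The rook and queen contiguity rates depend on the actual tract boundaries and cannot be reproduced from coordinates alone.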
In a time series \(x_t = \phi x_{t-1} + u_t \text {, }t=1,\ldots ,T\), \(x_t\) is said to be local-to-unit-root from the explosive side (LTUE) if \(\phi = 1 + 1/T\), mildly explosive (ME) if \(\phi = 1 + (T^{\alpha })/T\) with \(\alpha =0.1\) or 0.3, and explosive (EX) if \(\phi >1\). When T is large, \(\phi _{\text {LTUE}}< \phi _{\text {ME}} < \phi _{\text {EX}}\), which is not necessarily the case when T is small (see for instance Phillips 1987; Phillips and Magdalinos 2007; Tao and Yu 2020).
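A quick numerical check of this ordering is sketched below. It assumes a fixed explosive value \(\phi _{\text {EX}} = 1.05\), chosen here only because that value appears in the simulation design; the function names are illustrative.

```python
def phi_ltue(T: int) -> float:
    """Local-to-unit-root from the explosive side: 1 + 1/T."""
    return 1.0 + 1.0 / T


def phi_me(T: int, alpha: float) -> float:
    """Mildly explosive: 1 + T**alpha / T."""
    return 1.0 + T ** alpha / T


PHI_EX = 1.05  # assumed fixed explosive root


def ordering_holds(T: int, alpha: float) -> bool:
    """True when phi_LTUE < phi_ME < phi_EX for this T and alpha."""
    return phi_ltue(T) < phi_me(T, alpha) < PHI_EX


for T in (10, 1000):
    for alpha in (0.1, 0.3):
        print(T, alpha, round(phi_ltue(T), 4), round(phi_me(T, alpha), 4),
              ordering_holds(T, alpha))
```

For \(T=10\) both mildly explosive roots exceed 1.05, so the ordering fails, whereas for \(T=1000\) it holds for both values of \(\alpha \).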
As \(\phi =1.05\), \(\rho =0.8\), \(\delta =-0.84\), \(\varpi _{\min } = -0.0963 \) and \(\varpi _{\max }=1\) where \(\varpi _{\min }\) and \(\varpi _{\max }\) are the minimum and maximum eigenvalues of the spatial weights matrix \(W_N\), we cannot respect one of the two stationarity conditions (4) in footnote 4:
$$\begin{aligned} \left\{ \begin{array}{lllll} \phi + \left( \rho + \delta \right) \varpi _{\min }< 1 &{} \text { if } &{} \rho + \delta < 0 &{} \rightarrow &{} 1.0538 \nless 1, \\ \phi - \left( \rho - \delta \right) \varpi _{\max }> -1 &{} \text { if } &{} \rho - \delta \ge 0 &{} \rightarrow &{} -0.59 > -1. \\ \end{array} \right. \end{aligned}$$
We only used 1000 draws and 500 burn-in draws for each replication, which is small for MCMC. Despite this, 1000 replications with \(N = 63\), \(T = 10\) (resp. \(N = 120\), \(T = 20\)) require more than one hour of CPU time (resp. almost 5 hours). Had we used 10,000 draws and 1000 burn-in draws, it would have taken 8 (resp. 34) hours for \(N = 63\), \(T = 10\) (resp. \(N = 120\), \(T = 20\)). The computation times of B2S2S and QMLE are considerably shorter. For instance, in Table 1 the respective computation times are 3 min and 7 min for \(N = 63\), \(T = 10\) and 12 min and 20 min for \(N = 120\), \(T = 20\). When using mixtures of t-distributions, the B2S2S requires as little as 15 s for \(N = 63\), \(T = 10\) and 52 s for \(N = 120\), \(T = 20\).
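The stationarity check reported in the note on \(\phi =1.05\), \(\rho =0.8\), \(\delta =-0.84\) can be reproduced with a few lines. The sketch below covers only the two cases displayed there (\(\rho + \delta < 0\) and \(\rho - \delta \ge 0\)); the function name and return format are assumptions made for illustration.

```python
def stationarity_checks(phi, rho, delta, w_min, w_max):
    """Evaluate the two displayed conditions.

    w_min and w_max are the minimum and maximum eigenvalues of the
    spatial weights matrix. Each applicable condition maps to a pair
    (left-hand side, condition satisfied).
    """
    results = {}
    if rho + delta < 0:
        lhs = phi + (rho + delta) * w_min
        results["min_eigenvalue"] = (lhs, lhs < 1)
    if rho - delta >= 0:
        lhs = phi - (rho - delta) * w_max
        results["max_eigenvalue"] = (lhs, lhs > -1)
    return results


checks = stationarity_checks(phi=1.05, rho=0.8, delta=-0.84,
                             w_min=-0.0963, w_max=1.0)
for name, (lhs, ok) in checks.items():
    print(f"{name}: lhs = {lhs:.4f}, satisfied = {ok}")
```

The first left-hand side is about 1.0539, so that condition fails, while the second is about \(-0.59\) and holds, in line with the values reported in the note.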
We do not provide simulations for other combinations of \(\phi \), \(\rho \) and \(\delta \) for the sake of brevity.
With Monte Carlo simulations for a SAR model with i.i.d. errors, Yang (2021) shows that the biases (resp. RMSEs) (\(\times 100\)) of \(\rho (=0.4)\) for 2SLS are smaller in absolute value than (resp. close to) those of GMM: 0.05 (resp. 1.58) for 2SLS and \(-\,0.64\) (resp. 1.52) for GMM when \(N=50\), \(T=30\) and 0.01 (resp. 0.81) for 2SLS and \(-\,0.31\) (resp. 0.75) for GMM when \(N=100\), \(T=50\). Similar results are obtained for the coefficient \(\beta \).
See section E in the supplementary material for more details on the 2SLS estimator of Yang (2021) extended to the dynamic space–time case. We use our own R codes for our Bayesian estimator and the 2SLS estimator.
For \(N=63\), \(T=30\) (resp. \(T=50\)), the gain factor is 1.4 (resp. 3.2) and for \(N=120\), \(T=30\) (resp. \(T=50\)), the gain factor is 3.3 (resp. 7.8).
The growing season is generally defined as ranging from April 1 to September 30 in the literature. More specifically, it starts at sowing and lasts approximately 150 days.
As pointed out by Keane and Neal (2020), this may involve the use of more heat-tolerant hybrids, improved water retention in fields, irrigation, adjustment of sowing rates, etc. This adaptation includes all sources of covariation between heat and heat sensitivity of agricultural yields. It implies the active adaptation of farmers to temperature for growing techniques, as well as any other factors (not controlled by farmers) that make yields less sensitive to heat in warmer conditions.
The threshold for corn is \(29^{\circ }\)C.
Indeed, the MO-OLS estimation on the static model
$$\begin{aligned} \log y_{ti} = \beta _{1,ti} gdd_{ti} + \beta _{2,ti} kdd_{ti} + \beta _{3,ti} prec_{ti} + \beta _{4,ti} prec^{2}_{ti} + c_{ti} + u_{ti},\quad i=1,\ldots ,N,\quad t=1,\ldots ,T, \end{aligned}$$
implies a nonlinear relation between \(\hat{\beta }_{1,ti}\) and \(\log gdd_{ti}\) and between \(\hat{\beta }_{2,ti}\) and \(\log kdd_{ti}\) (see Table H.4 and Figures 10 and 11 in the supplementary material).
Their yield data came from the US Department of Agriculture (USDA) National Agricultural Statistics Service. Temperature and precipitation data were drawn from Schlenker and Roberts (2009).
Enlargements of these maps are reported in Figures 6 to 8 of the supplementary material.
See the supplementary material for additional maps and descriptive statistics, as well as data on the distribution of the Köppen–Geiger climate classification across counties.
This estimation is significantly better than that obtained by MO-OLS using the static non-spatial model which yields an \(R^2= 0.793\) and a residual variance \(\sigma ^2_u = 0.071\). See Table H.4 in the supplementary material.
We note that it is not possible to separate out the time from the space and space–time diffusion effects in this model unless we constrain \(\delta \) to equal \(- \phi \rho \).
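A short derivation shows where this restriction comes from. It is a sketch that assumes the model contains the terms \(\phi y_{t-1}\), \(\rho W_N y_t\) and \(\delta W_N y_{t-1}\), as suggested by the stationarity conditions above (the full specification is in the main text). Writing L for the time-lag operator, a separable space–time filter expands as
$$\begin{aligned} \left( I_N - \phi L\right) \left( I_N - \rho W_N\right) y_t = y_t - \phi y_{t-1} - \rho W_N y_t + \phi \rho W_N y_{t-1}, \end{aligned}$$
whereas the unrestricted model filters \(y_t\) as \(y_t - \phi y_{t-1} - \rho W_N y_t - \delta W_N y_{t-1}\). Matching the coefficients on \(W_N y_{t-1}\) gives \(-\delta = \phi \rho \), that is \(\delta = -\phi \rho \), which is the case in which the time filter and the space filter factor apart.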
The derivation of the dynamic multipliers is given in section H.2 in the supplementary material.
Enlargements of these maps are reported in Figures 12 to 14 of the supplementary material.
References
Bailey N, Holly S, Pesaran MH (2016) A two-stage approach to spatio-temporal analysis with strong and weak cross-sectional dependence. J Appl Econom 31:249–280
Baltagi BH, Bresson G, Chaturvedi A, Lacroix G (2018) Robust linear static panel data models using \(\varepsilon \)-contamination. J Econom 202:108–123
Baltagi BH, Bresson G, Chaturvedi A, Lacroix G (2021) Robust dynamic panel data models using \(\varepsilon \)-contamination. In: Chudik A, Hsiao C, Timmermann A (eds) Advances in econometrics, essays in honor of M. Hashem Pesaran. Emerald Publishing, Bingley
Berger J (1985) Statistical decision theory and Bayesian analysis. Springer, New York
Berger J, Berliner M (1986) Robust Bayes and empirical Bayes analysis with \(\varepsilon \)-contaminated priors. Ann Stat 14:461–486
Bivand RS, Gomez-Rubio V, Pebesma EJ (2008) Applied spatial data analysis with R. Springer, Berlin
Bun MJG, Carree MA, Juodis A (2017) On maximum likelihood estimation of dynamic panel data models. Oxf Bull Econ Stat 79:463–494
Burke M, Emerick K (2016) Adaptation to climate change: evidence from US agriculture. Am Econ J Econ Pol 8:106–40
Butler EE, Huybers P (2013) Adaptation of US maize to temperature variations. Nat Clim Change 3:68–72
Chamberlain G (1982) Multivariate regression models for panel data. J Econom 18:5–46
Chaturvedi A (1996) Robust Bayesian analysis of the linear regression. J Stat Plan Inference 50:175–186
Chudik A, Pesaran MH (2015a) Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors. J Econom 188:393–420
Chudik A, Pesaran MH (2015b) Large panel data models with cross-sectional dependence: a survey. In: Baltagi BH (ed) The Oxford handbook of panel data. Oxford University Press, Oxford, pp 3–45
Debarsy N, Ertur C, LeSage JP (2012) Interpreting dynamic space–time panel data models. Stat Methodol 9:158–171
Dell M, Jones BF, Olken BA (2012) Temperature shocks and economic growth: evidence from the last half century. Am Econ J Macroecon 4:66–95
Deschênes O, Greenstone M (2007) The economic impacts of climate change: evidence from agricultural output and random fluctuations in weather. Am Econ Rev 97:354–385
Deschênes O, Greenstone M (2011) Climate change, mortality, and adaptation: evidence from annual fluctuations in weather in the US. Am Econ J Appl Econ 3:152–85
Elhorst JP (2014) Spatial econometrics: from cross-sectional data to spatial panels. Springer, Berlin
Fernández C, Ley E, Steel MFJ (2001) Benchmark priors for Bayesian model averaging. J Econom 100:381–427
Gilks WR, Richardson S, Spiegelhalter DJ (1997) Markov chain Monte Carlo in practice, 2nd edn. Chapman & Hall, London
Good IJ (1965) The estimation of probabilities. MIT Press, Cambridge
Hausman JA, Taylor WE (1981) Panel data and unobservable individual effects. Econometrica 49:1377–1398
Hsiao C, Zhou Q (2018) Incidental parameters, initial conditions and sample size in statistical inference for dynamic panel data models. J Econom 207:114–128
Jin B, Wu Y, Rao CR, Hou L (2020) Estimation and model selection in general spatial dynamic panel data models. Proc Natl Acad Sci 117:5235–5241
Kass RE, Wasserman L (1995) A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. J Am Stat Assoc 90:928–934
Keane M, Neal T (2020) Climate change and US agriculture: accounting for multidimensional slope heterogeneity in panel data. Quant Econ 11:1391–1429
Koop G (2003) Bayesian econometrics. Wiley, New York
Koop G, Poirier DJ, Tobias JL (2007) Bayesian econometric methods. Cambridge University Press, Cambridge
Kripfganz S (2016) Quasi-maximum likelihood estimation of linear dynamic short-T panel-data models. Stata J 16:1013–1038
Kripfganz S, Schwarz C (2019) Estimation of linear dynamic panel data models with time-invariant regressors. J Appl Econom 34:526–546
Lee LF, Yu J (2015) Spatial panel data models. In: Baltagi BH (ed) The Oxford handbook of panel data. Oxford University Press, Oxford, pp 363–401
LeSage J, Pace R (2009) An introduction to spatial econometrics. CRC Press, Boca Raton
LeSage JP, Chih YY, Vance C (2019) Markov chain Monte Carlo estimation of spatial dynamic panel models for large samples. Comput Stat Data Anal 138:107–125
Lobell DB, Burke MB (2008) Why are agricultural impacts of climate change so uncertain? The importance of temperature relative to precipitation. Environ Res Lett 3:034007
Lobell DB, Hammer GL, McLean G, Messina C, Roberts MJ, Schlenker W (2013) The critical role of extreme heat for maize production in the United States. Nat Clim Change 3:497–501
Mendelsohn R, Nordhaus WD, Shaw D (1994) The impact of global warming on agriculture: a Ricardian analysis. Am Econ Rev 84:753–771
Moral-Benito E, Allison P, Williams R (2019) Dynamic panel data modelling using maximum likelihood: an alternative to Arellano-Bond. Appl Econ 51:2221–2232
Parent O, LeSage JP (2010) A spatial dynamic panel model with random effects applied to commuting times. Transp Res Part B Methodol 44:633–645
Parent O, LeSage JP (2011) A space-time filter for panel data models containing random effects. Comput Stat Data Anal 55:475–490
Pesaran MH (2006) Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74:967–1012
Phillips PCB (1987) Towards a unified asymptotic theory for autoregression. Biometrika 74:535–547
Phillips PCB (1991) To criticize the critics: an objective Bayesian analysis of stochastic trends. J Appl Econom 6:333–364
Phillips PCB, Magdalinos T (2007) Limit theory for moderate deviations from a unit root. J Econom 136:115–130
Porter JR, Xie L, Challinor AJ, Cochrane K, Howden SM, Iqbal MM, Travasso MI (2014) Food security and food production systems. In: IPCC (ed) Climate change 2014: impacts, adaptation, and vulnerability. Part A: global and sectoral aspects. Contribution of working group II to the fifth assessment report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, pp 485–533
Robert CP (2007) The Bayesian choice. From decision-theoretic foundations to computational implementation, 2nd edn. Springer, New York
Schlenker W, Roberts MJ (2009) Nonlinear temperature effects indicate severe damages to US crop yields under climate change. Proc Natl Acad Sci 106:15594–15598
Schlenker W, Hanemann WM, Fisher AC (2005) Will US agriculture really benefit from global warming? Accounting for irrigation in the hedonic approach. Am Econ Rev 95:395–406
Su L, Yang Z (2015) QML estimation of dynamic panel data models with spatial errors. J Econom 185:230–258
Tao Y, Yu J (2020) Model selection for explosive models. In: Li T, Pesaran MH, Terrell D (eds) Advances in econometrics, vol 41: essays in honor of Cheng Hsiao. Emerald Publishing Limited, pp 73–104
Wallace HA (1920) Mathematical inquiry into the effect of weather on corn yield in the eight corn belt states. Mon Weather Rev 48:439–446
Waller LA, Gotway CA (2004) Applied spatial statistics for public health data. Wiley, Hoboken
Yang CF (2021) Common factors and spatial dependence: an application to US house prices. Econom Rev 40:14–50
Yu J, De Jong R, Lee LF (2008) Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large. J Econom 146:118–134
Zellner A (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In: Goel P, Zellner A (eds) Bayesian inference and decision techniques: essays in honor of Bruno de Finetti. Elsevier Science Publishers, Inc., New York, pp 233–243
Additional information
This paper is written in honor of Peter Schmidt for his many contributions to econometrics, in particular his influential contributions to dynamic panel data models. We are grateful to Robin Sickles, Subal C. Kumbhakar and two anonymous referees for their helpful comments and suggestions. The usual disclaimers apply.
About this article
Cite this article
Baltagi, B.H., Bresson, G., Chaturvedi, A. et al. Robust dynamic space–time panel data models using \(\varepsilon \)-contamination: an application to crop yields and climate change. Empir Econ (2022). https://doi.org/10.1007/s00181-022-02348-9
DOI: https://doi.org/10.1007/s00181-022-02348-9
Keywords
- Climate change
- Crop yields
- Dynamic model
- \(\varepsilon \)-Contamination
- Panel data
- Robust Bayesian estimator
- Space–time
JEL Classification
- C11
- C23
- C26
- Q15
- Q54