Estimating Gravity Models of Trade with Correlated Time-Fixed Regressors: To IV or not IV?

Mitze, Timo

doi:10.1007/978-3-642-22901-5_6

Part of the book series: Lecture Notes in Economics and Mathematical Systems (LNE, volume 657)

Abstract

Gravity-type models are widely used in international economics. In these models, the inclusion of time-fixed regressors like geographical or cultural distance, language and institutional (dummy) variables is often of vital importance, e.g., to analyze the impact of trade costs on internationalization activity. This paper assesses the problem of parameter inconsistency due to a correlation of the time-fixed regressors with the combined error term in panel data settings. A common solution is to use instrumental variable (IV) estimation in the spirit of Hausman and Taylor (1981) since standard fixed effect model (FEM) estimation is not applicable. However, some potential shortcomings of the latter approach recently gave rise to the use of non-IV two-step estimators. Given their growing number of empirical applications, we aim to compare the performance of IV and non-IV approaches in the presence of time-fixed variables and right-hand-side endogeneity using Monte Carlo simulations, where we explicitly control for the problem of IV selection in the Hausman–Taylor case. The simulation results show that the Hausman–Taylor model with perfect knowledge about the underlying data structure (instrument orthogonality) has on average the smallest bias. However, compared to the empirically relevant specification with imperfect knowledge and instruments chosen by statistical criteria, simple non-IV rival estimators perform equally well or even better. We illustrate these findings by estimating gravity-type models for German regional export activity within the EU. The results show that the HT specification is likely to bias the role of trade costs proxied by geographical distance upwards.

Notes

1. In a recent comment, Greene (2010) criticizes the original approach by Plümper and Tröger (2007), arguing that they use an incorrect variance-covariance matrix that results in systematically underestimated standard errors. Bootstrapping the standard errors may thus be seen as a more appropriate choice.
2. A Google search for the exact phrase “Fixed Effects Vector Decomposition” (in quotation marks) by now returns almost 2100 entries.
3. Here, we use the terms ‘endogenous’ and ‘exogenous’ to refer to variables that are, respectively, correlated or uncorrelated with the unobserved individual effects $\mu_i$. An alternative classification scheme used in the panel data literature labels variables as either ‘doubly exogenous’ with respect to both error components $\mu_i$ and $\nu_{i,t}$, or ‘singly exogenous’ with respect to $\nu_{i,t}$ only. We use the two classifications interchangeably here.
4. Note that the HT model can also be estimated based on a slightly different transformation, namely the filtered instrumental variable (FIV) estimator. The latter transforms the estimation equation by GLS but uses unfiltered instruments. Both approaches typically yield similar parameter estimates; see Ahn and Schmidt (1999).
5. The total number of IVs in the HT model is $2k_1+k_2+g_1$ ($k_1+k_2$ from QX1 and QX2, $k_1$ from PX1 and $g_1$ from Z1).
6. The FEVD may be seen as an extension of an earlier model in Hsiao (2003). For details, see Plümper and Tröger (2007).
7. For details, see Atkinson and Cornwell (2006).
8. A modification of the standard FEVD approach also allows the second step to be estimated as an IV regression, thus accounting for endogeneity among the time-invariant variables and $\eta_i$. Following Atkinson and Cornwell (2006), we can define a standard IV estimator as $\hat{\gamma}_{\mathit{FEVD}}=(S'Z)^{-1}S'\hat{\pi}$, where S is the instrument set that satisfies the orthogonality condition $E(S \;\eta)=0$. However, this brings back the classification problem of the HT approach, which we aim to avoid here.
9. $\xi$ defines the ratio of the variance terms of the error components as $\xi=\sigma_\mu/\sigma_\nu$.
10. The CEEC aggregate includes Hungary, Poland, the Czech Republic, Slovakia, Slovenia, Estonia, Latvia, Lithuania, Romania and Bulgaria.
11. Results for an import equation, which are qualitatively similar, can be obtained from the author upon request.
12. Further details can be found in the data Appendix B in Table 6.3.
13. We vary $g_1$ and $g_2$ on the interval [−2,2]. The default is $g_1=g_2=2$.
14. For the FEVD estimator, we employ the Stata routine xtfevd written by Plümper and Tröger (2007); the HT model is implemented using the user-written Stata routine ivreg2 by Baum et al. (2003).
15. A detailed description of different moment selection criteria is given in a longer working paper version of this paper; see Mitze (2009).
16. Generally, the MSC-BIC criterion is found to have the best empirical performance in large samples, while the MSC-AIC outranks the other criteria in small-sample settings but performs poorly otherwise.
17. The C-statistic can be derived as the difference of two Hansen/Sargan overidentification tests with $C=J-J_1\sim\chi^2(M-M_1)$, where $M_1$ is the number of instruments in $S_1$ and M is the total number of IVs.

References

Ahn, S., & Low, S. (1996). A reformulation of the Hausman test for regression models with pooled cross-section-time series data. Journal of Econometrics, 71, 309–319.
Ahn, S., & Schmidt, P. (1999). Estimation of linear panel data models using GMM. In L. Matyas (Ed.), Generalized methods of moments estimation. Cambridge: Cambridge University Press.
Akther, S., & Daly, K. (2009). Finance and poverty: evidence from fixed effect vector decomposition. Emerging Markets Review, 10(3), 191–206.
Alecke, B., Mitze, T., & Untiedt, G. (2003). Das Handelsvolumen der ostdeutschen Bundesländer mit Polen und Tschechien im Zuge der EU-Osterweiterung: Ergebnisse auf Basis eines Gravitationsmodells. DIW Vierteljahrshefte zur Wirtschaftsforschung, 72(4), 565–578.
Alecke, B., Mitze, T., & Untiedt, G. (2010). Trade-FDI linkages in a simultaneous equations system of gravity models for German regional data. International Economics, 122, 121–162.
Alfaro, R. (2006). Application of the symmetrically normalized IV estimator (Working Paper). Boston University.
Andrews, D. (1999). Moment selection procedures for generalized method of moments estimation. Econometrica, 67(3), 543–564.
Andrews, D., & Lu, B. (2001). Consistent model and moment selection procedures for GMM estimation with application to dynamic panel data models. Journal of Econometrics, 101, 123–164.
Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58, 277–297.
Arnold, J., & Hussinger, K. (2006). Export versus FDI in German manufacturing: firm performance and participation in international markets (Deutsche Bundesbank Discussion Paper, No. 04/2006).
Atkinson, S., & Cornwell, C. (2006). Inference in two-step panel data models with time-invariant regressors (Working Paper). University of Georgia.
Baldwin, R., & Taglioni, D. (2006). Gravity for dummies and dummies for gravity equations (NBER Working Paper No. 12516).
Baltagi, B. (2008). Econometric analysis of panel data (4th ed.). Chichester: John Wiley & Sons.
Baltagi, B., Bresson, G., & Pirotte, A. (2003). Fixed effects, random effects or Hausman Taylor? A pretest estimator. Economics Letters, 79, 361–369.
Baltagi, B., & Chang, Y. (2000). Simultaneous equations with incomplete panels. Econometric Theory, 16, 269–279.
Baum, C., Schaffer, M., & Stillman, S. (2003). IVREG2: Stata module for extended instrumental variables/2SLS and GMM estimation (Statistical Software Component No. S425401). Boston College.
Belke, A., & Spies, J. (2008). Enlarging the EMU to the East: what effects on trade? Empirica, 4, 369–389.
Breuss, F., & Egger, P. (1999). How reliable are estimations of East–West trade potentials based on cross-section gravity analyses? Empirica, 26(2), 81–94.
Buch, C., & Piazolo, D. (2000). Capital and trade flows in Europe and the impact of enlargement (Kiel Working Papers No. 1001). Kiel Institute for the World Economy.
Caetano, J., & Galego, A. (2003). An analysis of actual and potential trade between the EU countries and the Eastern European countries (Documento de trabalho No. 03/2003). Universidade de Évora.
Caetano, J., Galego, A., Vaz, E., Vieira, C., & Vieira, I. (2002). The Eastern enlargement of the Eurozone. Trade and FDI (Ezoneplus Working Paper No. 7).
Caporale, G., Rault, C., Sova, R., & Sova, A. (2008). On the bilateral trade effects of free trade agreements between the EU-15 and the CEEC-4 Countries (CESifo Working Paper Series No. 2419).
Destatis (2008). Außenhandel nach Bundesländern, various issues. German Statistical Office, Wiesbaden.
Disdier, A., & Head, K. (2008). The puzzling persistence of the distance effect on bilateral trade. The Review of Economics and Statistics, 90(1), 37–48.
Egger, P. (2000). A note on the proper econometric specification of the gravity equation. Economics Letters, 66, 25–31.
Eichenbaum, M., Hansen, L., & Singleton, K. (1988). A time series analysis of representative agent models of consumption and leisure under uncertainty. Quarterly Journal of Economics, 103, 51–78.
Etzo, I. (2007). Determinants of interregional migration in Italy: a panel data analysis (MPRA Paper No. 5307).
EU Commission (2008). AMECO Database. Directorate General for Economic and Financial Affairs, available at: http://ec.europa.eu/economy_finance/ameco.
Eurostat (2008). National accounts (including GDP), various issues, available at: http://epp.eurostat.ec.europa.eu.
Feenstra, R. (2004). Advanced international trade. Theory and evidence. Princeton: Princeton University Press.
Fidrmuc, J. (2008). Gravity models in integrated panels. Empirical Economics, 37(2), 435–446.
GGDC (2008). Total economy database. Groningen growth and development centre, available at: www.ggdc.net.
Greene, W. (2010). Fixed effects vector decomposition: a magical solution to the problem of time invariant variables in fixed effects models? (unpublished Working Paper), download from: http://pages.stern.nyu.edu/~wgreene.
Hansen, L. (1982). Large sample properties of generalised method of moments estimators. Econometrica, 50, 1029–1054.
Hausman, J. (1978). Specification tests in econometrics. Econometrica, 46, 1251–1271.
Hausman, J., & Taylor, W. (1981). Panel data and unobservable individual effects. Econometrica, 49, 1377–1399.
Helpman, E., Melitz, M., & Yeaple, S. (2003). Export versus FDI (NBER Working Paper No. 9439).
Henderson, D., & Millimet, D. (2008). Is gravity linear? Journal of Applied Econometrics, 23, 137–172.
Hong, H., Preston, B., & Shum, M. (2003). Generalized empirical likelihood-based model selection criteria for moment condition models. Econometric Theory, 19, 923–943.
Hsiao, C. (2003). Analysis of panel data (2nd ed.). Cambridge: Cambridge University Press.
Im, K., Ahn, S., Schmidt, P., & Wooldridge, J. (1999). Efficient estimation in panel data models with strictly exogenous explanatory variables. Journal of Econometrics, 93, 177–203.
Jakab, Z., Kovács, M., & Oszlay, A. (2001). How far has trade integration advanced?: an analysis of the actual and potential trade of three Central and Eastern European countries. Journal of Comparative Economics, 29, 276–292.
Krogstrup, S., & Wälti, S. (2008). Do fiscal rules cause budgetary outcomes? Public Choice, 136(1), 123–138.
Lafourcade, M., & Paluzie, E. (2005). European integration, FDI and the internal geography of trade: evidence from Western European border regions. Paper presented at the EEA congress in Amsterdam, August 2005.
Linders, G. (2005). Distance decay in international trade patterns: a meta-analysis (ERSA 2005 Conference Paper ersa05p679).
Matyas, L. (1997). Proper econometric specification of the gravity model. The World Economy, 20(3), 363–368.
Mitze, T. (2009). Endogeneity in panel data models with time-varying and time-fixed regressors: to IV or not IV? (Ruhr Economic Papers No. 83).
Mitze, T., Alecke, B., & Untiedt, G. (2010). Trade-FDI linkages in a simultaneous equation system of gravity models for German regional data. Economie Internationale/International Economics, forthcoming.
Mundlak, Y. (1978). On the pooling of time series and cross-section data. Econometrica, 46, 69–85.
Murphy, K., & Topel, R. (1985). Estimation and inference in two-step econometric models. Journal of Business and Economic Statistics, 3, 88–97.
Pagan, A. (1984). Econometric issues in the analysis of regressions with generated regressors. International Economic Review, 25, 221–247.
Plümper, T., & Tröger, V. (2007). Efficient estimation of time invariant and rarely changing variables in panel data analysis with unit effects. Political Analysis, 15, 124–139.
Sargan, J. (1958). The estimation of economic relationships using instrumental variables. Econometrica, 26, 393–415.
Schumacher, D., & Trübswetter, P. (2000). Volume and comparative advantage in East West trade (DIW Discussion Paper No. 223).
VGRdL (2008). Volkswirtschaftliche Gesamtrechnungen der Bundesländer (Regional Accounts for German States), available at: https://vgrdl.de.
White, H. (1984). Asymptotic theory for econometrics. New York: Academic Press.
Zwinkels, R., & Beugelsdijk, S. (2010). Gravity equations: workhorse or Trojan horse in explaining trade and FDI patterns across time and space? International Business Review, 19(1), 102–115.

Author information

Authors and Affiliations

Rheinisch-Westfälisches Institut für Wirtschaftsforschung, Essen, Germany
Dr. Timo Mitze


Appendices

Appendix A: Monte Carlo Simulation Design

The starting point for the Monte Carlo simulation experiment is (6.4). The time-varying regressors $x_{1_{1}}, x_{1_{2}},x_{2_{1}}, x_{2_{2}}$ are generated by the following autoregressive process:

$$x_{n_m,i,t=1} = 0 \quad \text{with } n,m = 1,2,$$

(6.8)

$$x_{1_1,i,t} = \rho_1 x_{1_1,i,t-1} + \delta_i + \xi_{i,t} \quad \text{for } t = 2,\ldots,T,$$

(6.9)

$$x_{1_2,i,t} = \rho_2 x_{1_2,i,t-1} + \psi_i + \omega_{i,t} \quad \text{for } t = 2,\ldots,T,$$

(6.10)

$$x_{2_1,i,t} = \rho_3 x_{2_1,i,t-1} + \mu_i + \tau_{i,t} \quad \text{for } t = 2,\ldots,T,$$

(6.11)

$$x_{2_2,i,t} = \rho_4 x_{2_2,i,t-1} + \mu_i + \lambda_{i,t} \quad \text{for } t = 2,\ldots,T.$$

(6.12)

For the time-fixed regressors $z_{1_{1}}, z_{1_{2}}, z_{2_{1}}$ we analogously define

$$z_{1_1,i} = 1,$$

(6.13)

$$z_{1_2,i} = g_1 \psi_i + g_2 \delta_i + \kappa_i,$$

(6.14)

$$z_{2_1,i} = \mu_i + \delta_i + \psi_i + \epsilon_i.$$

(6.15)

The variable $z_{1_{1},i}$ simplifies to a constant term, and $z_{2_{1},i}$ is the endogenous time-fixed regressor since it contains $\mu_i$ as a right-hand-side variable. The weights $g_1$ and $g_2$ in the specification of $z_{1_{2},i}$ control the degree of correlation with the time-varying variables $x_{1_{1},i,t}$ and $x_{1_{2},i,t}$ (see Note 13). The remaining innovations in the data generating process are defined as follows:

$$\nu_{i,t} \sim N(0,\sigma_\nu^2),$$

(6.16)

$$\mu _i \sim N(0,\sigma _\mu ^2 ),$$

(6.17)

$$\delta _i \sim U( - 2,2),$$

(6.18)

$$\xi _{i,t} \sim U( - 2,2),$$

(6.19)

$$\psi _i \sim U( - 2,2),$$

(6.20)

$$\omega _{i,t} \sim U( - 2,2),$$

(6.21)

$$\tau _{i,t} \sim U( - 2,2),$$

(6.22)

$$\lambda _{i,t} \sim U( - 2,2),$$

(6.23)

$$\epsilon_i \sim U(-2,2),$$

(6.24)

$$\kappa _i \sim U( - 2,2).$$

(6.25)

Except for $\mu_i$ and $\nu_{i,t}$, which are drawn from normal distributions with zero mean and variances $\sigma_{\mu}^{2}$ and $\sigma_{\nu}^{2}$, respectively, all innovations are uniform on [−2,2]. For the unit-specific terms $\mu_i$, $\delta_i$, $\psi_i$, $\epsilon_i$ and $\kappa_i$, the first draw is held fixed over all T periods. With respect to the main parameter settings in the Monte Carlo simulation experiment we set:

$\beta_{1_{1}}=\beta_{1_{2}}=\beta_{2_{1}}=\beta_{2_{2}}=1$
$\gamma_{1_{2}}=\gamma_{2_{1}}=1$
$\rho_{1}=\rho_{2}=\rho_{3}=\rho_{4}=0.7$

All variable coefficients are normalized to one, and setting $\rho<1$ ensures that the time-varying variables are stationary. We also normalize $\sigma_\nu$ to one and define a load factor $\xi=\sigma_\mu/\sigma_\nu$ that determines the ratio of the variance terms of the error components; $\xi$ takes the values 2, 1 and 0.5. We run simulations for different combinations of the cross-section and time dimensions of the panel, with N=(100,500,1000) and T=(5,10). All Monte Carlo simulations are conducted with 500 replications for each permutation in y and u. As in Arellano and Bond (1991), we generate T+10 periods and cut off the first 10 cross-sections, which gives a total sample size of NT observations.
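The data generating process in (6.8)–(6.25) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the simulations: the outcome equation (6.4) is not reproduced in this appendix, so the sketch assumes a linear form in which y is the sum of all regressors (each with coefficient one, including the constant $z_{1_1}$) plus $\mu_i+\nu_{i,t}$. The function name `simulate_panel`, its arguments and the dictionary keys are hypothetical.

```python
import random


def simulate_panel(N=100, T=5, xi=1.0, rho=0.7, g1=2.0, g2=2.0,
                   burn_in=10, seed=0):
    """Draw one panel from the DGP in (6.8)-(6.25); returns a list of row dicts."""
    rng = random.Random(seed)
    sigma_nu = 1.0             # normalised to one in the text
    sigma_mu = xi * sigma_nu   # load factor xi = sigma_mu / sigma_nu

    def unif():
        return rng.uniform(-2.0, 2.0)

    rows = []
    for i in range(N):
        # Unit-specific terms: drawn once and held fixed over time.
        mu = rng.gauss(0.0, sigma_mu)
        delta, psi, eps, kappa = unif(), unif(), unif(), unif()
        z11 = 1.0                            # (6.13)
        z12 = g1 * psi + g2 * delta + kappa  # (6.14)
        z21 = mu + delta + psi + eps         # (6.15), correlated with mu
        x11 = x12 = x21 = x22 = 0.0          # (6.8): start value at t = 1
        for t in range(1, T + burn_in + 1):
            if t > 1:                        # (6.9)-(6.12)
                x11 = rho * x11 + delta + unif()
                x12 = rho * x12 + psi + unif()
                x21 = rho * x21 + mu + unif()
                x22 = rho * x22 + mu + unif()
            if t <= burn_in:                 # discard the first periods
                continue
            nu = rng.gauss(0.0, sigma_nu)
            # ASSUMPTION: linear outcome equation with unit coefficients.
            y = z11 + x11 + x12 + x21 + x22 + z12 + z21 + mu + nu
            rows.append({"i": i, "t": t - burn_in, "y": y,
                         "x11": x11, "x12": x12, "x21": x21, "x22": x22,
                         "z11": z11, "z12": z12, "z21": z21,
                         "mu": mu, "nu": nu})
    return rows


if __name__ == "__main__":
    panel = simulate_panel(N=100, T=5, xi=2.0, seed=42)
    print(len(panel), "observations")
```

Looping this function over N=(100,500,1000), T=(5,10) and ξ=(2,1,0.5) gives the 18 designs described above, each of which would be replicated 500 times with different seeds.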

We apply the FEVD and Hausman–Taylor estimators (see Note 14). As outlined above, one drawback of earlier Monte Carlo based comparisons between the HT model and rival non-IV candidates was the strong assumption made for IV selection in the HT case, namely that the true correlation between the right-hand-side variables and the error term is known. However, this may not reflect the identification and estimation problem in applied econometric work, and Alfaro (2006) identifies it as one of the open questions for future investigation in Monte Carlo simulations. We therefore account for the HT variable classification problem by implementing algorithms from the literature on model selection criteria, which combine information from the Hansen/Sargan overidentification test for moment condition selection, as outlined above, with time-series information criteria. Following Andrews (1999), we define a general model selection criterion (MSC) based on IV estimation as

$$ \mathit{MSC}_{n}(m) = J(m)-h(c)k_{n},$$

(6.26)

where n is the sample size, c is the number of moment conditions selected by model m, J(m) is the Hansen J-statistic of model m, h(.) is a general function and $k_n$ is a constant term. As (6.26) shows, the model selection criterion centers around the J-statistic (see Note 15). The second part of (6.26) defines a ‘bonus’ term rewarding models with more moment conditions, where the form of the function h(.) and the constant terms $k_n$ are specified by the researcher. For empirical application, Andrews (1999) proposes three operationalizations in analogy to model selection criteria from time series analysis:

MSC-BIC: $J(m)-(k-g)\ln n$
MSC-AIC: $J(m)-2(k-g)$
MSC-HQIC: $J(m)-Q(k-g)\ln\ln n$ with $Q=2.01$

where $(k-g)$ is the number of overidentifying restrictions and, depending on the form of the bonus term, the MSC takes the BIC (Bayesian), AIC (Akaike) or HQIC (Hannan–Quinn) form. We apply all three information criteria in the Monte Carlo simulations, motivated by the results in Andrews and Lu (2001) and Hong et al. (2003) that the superiority of one criterion over the others in terms of finding consistent moment conditions may vary with the sample size (see Note 16). For each of these MSC criteria, we specify the following algorithms:

1. Unrestricted form: For all possible IV combinations out of the full IV set S=(QX1,QX2,PX1,PX2,Z1,Z2), where Q denotes deviations from group means and P denotes group means, and subject to the order condition $k_1>g_2$ (giving a total of 42 combinations), we calculate the value of the MSC criterion (separately for the BIC, AIC and HQIC) and choose as the final HT specification the model with the minimum MSC value over all candidates.
2. Restricted form: This algorithm follows the basic logic from above but adds the restriction that only those models serve as MSC candidates for which the p-value of the J-statistic is above a critical value $K_{\mathit{crit.}}$, which we set to 0.05 to maximize the likelihood that the selected moment conditions are valid in terms of statistical pre-testing. The restricted (see Andrews 1999, for this point).
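The three criteria and the two search variants can be illustrated with a short Python sketch. It assumes that the J-statistic, its p-value and the number of overidentifying restrictions $(k-g)$ have already been computed for every candidate IV set by some estimation routine; the function names `msc` and `select_model`, the dictionary keys and the candidate values in the example are hypothetical and purely illustrative. Returning `None` when no candidate passes the pre-test is also an assumption, since the text does not cover that case.

```python
import math


def msc(j_stat, n_overid, n_obs, kind="BIC"):
    """Moment selection criterion: J-statistic minus a bonus term in (k - g)."""
    if kind == "BIC":
        bonus = n_overid * math.log(n_obs)
    elif kind == "AIC":
        bonus = 2.0 * n_overid
    elif kind == "HQIC":
        bonus = 2.01 * n_overid * math.log(math.log(n_obs))
    else:
        raise ValueError("kind must be 'BIC', 'AIC' or 'HQIC'")
    return j_stat - bonus


def select_model(candidates, n_obs, kind="BIC", p_crit=None):
    """Pick the candidate with the minimum MSC value.

    p_crit=None gives the unrestricted search; a number (e.g. 0.05) gives the
    restricted search, which keeps only candidates whose J-test p-value
    exceeds p_crit. Returns None if no candidate survives (an assumption).
    """
    pool = [c for c in candidates if p_crit is None or c["p"] > p_crit]
    if not pool:
        return None
    return min(pool, key=lambda c: msc(c["j"], c["overid"], n_obs, kind))


if __name__ == "__main__":
    # Made-up J-statistics and p-values for three candidate IV sets.
    candidates = [
        {"name": "A", "j": 12.0, "p": 0.01, "overid": 4},
        {"name": "B", "j": 3.0, "p": 0.40, "overid": 2},
        {"name": "C", "j": 1.0, "p": 0.30, "overid": 1},
    ]
    print(select_model(candidates, n_obs=500, kind="BIC")["name"])
    print(select_model(candidates, n_obs=500, kind="BIC", p_crit=0.05)["name"])
```

In this invented example the unrestricted BIC search favours the set with the most moment conditions even though its J-test rejects, while the restricted search excludes it, which is the kind of difference between the two variants that the pre-testing restriction is meant to produce.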

We present flow charts of the restricted and unrestricted MSC based search algorithms in Fig. 6.8. As Andrews (1999) argues, the model selection criterion specified above is closely related to the C-statistic approach by Eichenbaum et al. (1988) to test whether a given subset of moment conditions is correct or not (see Note 17).

Thus, as an alternative to the algorithms described above, we specify a downward-testing approach based on the C-statistic. Here, we start from the HT model with the full IV set in terms of the REM moment conditions, $S_1$=(QX1,QX2,PX1,PX2,Z1,Z2). We calculate the value of the J-statistic for the model with IV set $S_1$ and compare its p-value with a predefined critical value $K_{\mathit{crit.}}$, which we set, in line with the above algorithm, to $K_{\mathit{crit.}}=0.05$. If $P_{S_{1}}>K_{\mathit{crit.}}$, we take this model as a valid representation in terms of the underlying moment conditions. If not, we calculate the value of the C-statistic for each single instrument in $S_1$ and exclude from the IV set the instrument with the maximum value of the C-statistic.

We then re-estimate the model based on the IV subset $S_2$, net of the instrument with the highest C-statistic, and again calculate the J-statistic and its respective p-value. If $P_{S_{2}}>K_{\mathit{crit.}}$, we take the HT model with $S_2$ as the final specification; otherwise, we again calculate the C-statistic for each instrument and exclude the one with the highest value. We run this downward-testing algorithm for moment conditions until we find a model that satisfies $P_{S_{.}}>K_{\mathit{crit.}}$ or, at the most, until we reach the IV sets $S_n$ to $S_m$ for which the number of overidentifying restrictions is $(k-g)=1$, since the J-statistic is not defined for just-identified models. Out of $S_n$ to $S_m$, we then pick the model with the lowest J-statistic value. The C-statistic based model selection algorithm is graphically summarized in Fig. 6.9.
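The downward-testing logic can be sketched as follows. The sketch is one possible reading of the description above, not the implementation behind Fig. 6.9. It assumes two user-supplied callables, `j_test(iv_set)` returning the J-statistic and its p-value and `c_stat(iv_set, instrument)` returning the C-statistic of a single instrument, and it treats every instrument as a single column so that the number of overidentifying restrictions is `len(iv_set) - n_params`. All of these names are hypothetical. The final step, in which all subsets with one overidentifying restriction are compared by their J-statistic, is our interpretation of how the sets $S_n$ to $S_m$ arise.

```python
def downward_testing(full_set, j_test, c_stat, n_params, p_crit=0.05):
    """Drop instruments one at a time until the J-test no longer rejects.

    full_set : list of instrument labels (each counted as one column)
    j_test   : callable(iv_set) -> (J statistic, p-value)
    c_stat   : callable(iv_set, instrument) -> C statistic
    n_params : number of parameters identified by the instruments
    """
    current = list(full_set)
    while True:
        _, p_value = j_test(current)
        if p_value > p_crit:
            return current          # moment conditions not rejected
        overid = len(current) - n_params
        if overid <= 1:
            return current          # nothing left to test against
        if overid == 2:
            # Every one-instrument deletion has a single overidentifying
            # restriction; pick the subset with the lowest J-statistic.
            subsets = [[s for s in current if s != d] for d in current]
            return min(subsets, key=lambda s: j_test(s)[0])
        # Otherwise remove the instrument with the largest C-statistic.
        worst = max(current, key=lambda s: c_stat(current, s))
        current = [s for s in current if s != worst]


if __name__ == "__main__":
    # Toy stand-ins for the test statistics: only "bad" violates orthogonality.
    def toy_j(iv_set):
        return (10.0, 0.01) if "bad" in iv_set else (1.0, 0.60)

    def toy_c(iv_set, instrument):
        return 9.0 if instrument == "bad" else 0.5

    print(downward_testing(["a", "b", "c", "d", "bad"], toy_j, toy_c, n_params=2))
```

In applied work the two callables would wrap whatever routine provides the Hansen/Sargan and difference-in-Sargan statistics; the sketch deliberately leaves that part open.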

Appendix B: Variable Description for the Gravity Model

Table 6.3 Data description and source for export model

Mitze, T. (2012). Estimating Gravity Models of Trade with Correlated Time-Fixed Regressors: To IV or not IV?. In: Empirical Modelling in Regional Science. Lecture Notes in Economics and Mathematical Systems, vol 657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22901-5_6

DOI: https://doi.org/10.1007/978-3-642-22901-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22900-8
Online ISBN: 978-3-642-22901-5