Baseline setup
Our goal is to examine the contribution of terminating BITs to the flow of FDI consistently. To this end, we exploit the timing of BIT terminations as a source of variation in the FDI inflows. Our identification strategy relies on pre-termination trends, which we exploit by using a simple DD estimator. In this respect, we augment the simple 2 × 2 DD empirical specification with multiple groups and multiple time periods (Bertrand et al., 2004; Hansen, 2007a, 2007b; Imbens & Wooldridge, 2009):
$$y_{i,t}\;=\;\gamma_i\;+\;\lambda_t\;+\;\hat{\beta}\cdot M_{i,t}\;+\;X_{i,t}^{'}\;\delta\;+\;\varepsilon_{i,t}$$
(1)
where \(y_{i,t}\) is the FDI inflow to India from the i-th investor in year-quarter t. The key coefficient of interest is \(\hat{\beta }\), which denotes the impact of BIT termination M. Treatment is measured with a dummy variable that takes the value one if FIHC government i has no BIT in year-quarter t and zero otherwise. This coding is the opposite of that used in most other studies, but it is intentional: the coefficient of the DD fixed-effects estimator can then be interpreted directly as the effect of BIT termination.
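The coding of the treatment dummy can be illustrated with a minimal sketch. The function name `no_bit_dummy`, the `(year, quarter)` tuple representation and the example dates are hypothetical and not taken from the data; the sketch also assumes that the dummy switches to one in the effective termination quarter itself.

```python
from typing import Optional, Tuple

Quarter = Tuple[int, int]  # (year, quarter), e.g. (2017, 1) for 2017Q1


def no_bit_dummy(quarter: Quarter,
                 ever_had_bit: bool,
                 termination: Optional[Quarter]) -> int:
    """Return M_{i,t}: 1 if the FIHC has no BIT in force in `quarter`, else 0."""
    if not ever_had_bit:
        return 1  # FIHC that never ratified a BIT: dummy is always one
    if termination is None:
        return 0  # BIT in force throughout: dummy is always zero
    # Assumption: the switch from 0 to 1 happens in the termination quarter.
    return 1 if quarter >= termination else 0


# Hypothetical FIHC whose BIT is effectively terminated in 2017Q1
quarters = [(2016, 3), (2016, 4), (2017, 1), (2017, 2)]
m_path = [no_bit_dummy(q, True, (2017, 1)) for q in quarters]
print(m_path)  # [0, 0, 1, 1]
```

Because the dummy equals one when no BIT is in force, a negative coefficient on it would read directly as a drop in FDI inflows after termination.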
The term \(\gamma_i\) captures unobserved investor-level heterogeneity, and \(\lambda_t\) denotes the set of time-varying FDI technology shocks common to all home countries of foreign investors in India. The vector X contains covariates capturing time-varying investor characteristics. In studies with a large number of zeros and heteroscedasticity, the Poisson pseudo-maximum likelihood (PPML) estimator with clustered standard errors is recommended (Head & Mayer, 2014; Santos Silva & Tenreyro, 2006; Westerlund & Wilhelmsson, 2011). In panels, it delivers consistent and unbiased estimates when the number of groups and time periods is reasonably large (Fernandez-Val & Weidner, 2016). In the following sections, we discuss the identifying assumptions and major challenges and outline important checks that help us test the validity of our approach in the case of BIT termination.
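To illustrate why a Poisson-type estimator copes with zero inflows, consider a minimal sketch of the saturated 2 × 2 special case (group dummy, post dummy and their interaction). This is not the full specification in Eq. (1) with investor and time fixed effects and covariates. In the saturated case the fitted cell means equal the sample cell means, so the interaction coefficient is a log ratio of ratios of means and zeros in individual observations pose no problem. The function name and all numbers are hypothetical.

```python
import math
from statistics import mean


def poisson_dd_2x2(treated_pre, treated_post, control_pre, control_post):
    """Interaction coefficient of a saturated 2x2 Poisson DD.

    Only the four cell means enter, so individual zero observations are
    admissible as long as every cell mean is strictly positive.
    """
    return (math.log(mean(treated_post)) - math.log(mean(treated_pre))
            - (math.log(mean(control_post)) - math.log(mean(control_pre))))


# Hypothetical quarterly FDI inflows (zeros included on purpose)
beta = poisson_dd_2x2(
    treated_pre=[4.0, 0.0, 8.0],   # cell mean 4
    treated_post=[2.0, 0.0, 1.0],  # cell mean 1
    control_pre=[3.0, 5.0, 4.0],   # cell mean 4
    control_post=[6.0, 0.0, 6.0],  # cell mean 4
)
# A Poisson coefficient translates into a percentage change via exp(beta) - 1
percent_change = (math.exp(beta) - 1.0) * 100.0
print(round(beta, 3), round(percent_change, 1))  # -1.386 -75.0
```

A log-linear OLS regression on the same data would have to drop or transform the zero observations, which is the practical reason for preferring the Poisson form here.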
Exogeneity and quasi-randomization of BIT termination
To understand whether the exogeneity and random occurrence of BIT terminations in India is a plausible assumption, we require detailed knowledge about the termination decision and the effective termination dates.
BITs have often been ratified in response to the prospect of large volumes of investment. BIT terminations in India, however, have different reasons and do not appear to be a direct response to FIHC-specific factors. In India, the main rationale behind termination was the increasing number of BIT-based ISDS claims and India's urge to introduce a new model BIT, which limits the scope of investors' rights and gives India more space for regulation and policy making (Ranjan, 2019). Terminations were not directed against the few foreign investors involved in ISDS procedures against India. It was instead a general decision to terminate all BITs.Footnote 12 An example is the ISDS case won by an Australian investor in 2011. Indian officials have singled out this case as a reason for revising their BIT strategy, but it did not lead to the termination of the Australian BIT until March 2017, whereas the first termination, against Argentina in August 2013, happened without any previous ISDS claim or other dispute.
Moreover, BIT terminations in India are not directly related to contemporaneous foreign investor relations such as high FDI stocks or recent FDI inflows. Sparing important BIT partner countries (for example, Mauritius and Singapore) from BIT termination would indicate a strategic termination decision. Likewise, finding FIHC governments with high bargaining power on one side (e.g. retaining a BIT) and less powerful FIHCs on the other would indicate that termination is not ‘as good as random’. Looking at the effective terminations in the case of India, we see that FIHCs with high bargaining power end up in different groups: Singapore retains a ratified BIT, while Mauritius experienced an effective termination of its BIT in 2017. A rather heterogeneous set of ratification dates, treaty durations and sunset clauses among countries with arguably high bargaining power (France May 2000, Netherlands December 1996, Spain December 1998, Italy March 1998, etc.) left some of these countries with effective termination dates within a few months of the announcement of the general decision (e.g. France 2017Q2, Spain 2016Q3, Netherlands 2016Q4, Italy 2017Q1) and others with termination dates after longer time spans (e.g. China: BIT ratification in August 2007 and BIT termination in late 2018). For an overview of the distribution of ratification years, treaty durations and sunset clause durations, see Table 2.
Table 2 Timing of ratification and treaty terms
Figure 2 shows an overview of the three groups of FIHCs, ordered by aggregate FDI inflows between 2013 and 2019 within each group. In addition to our investigation of ratification dates, renewal procedures, and the announcement timing, the pattern derived from ordering FIHCs by aggregate FDI inflows from larger to smaller does not appear to indicate bias towards FIHCs with high bargaining power in any of the groups (‘never’, ‘always’ and ‘timing’).
In fact, the critical factors for effective termination are the decision to terminate all BITs and the expiration date of each individual BIT. The expiration date depends on the treaty terms of each BIT, such as the minimum duration (‘honeymoon phase’) and the extension and termination rules, along with the original ratification date. In India, of all 87 BITs, the ‘honeymoon phase’ amounts to 10 years for 72 FIHCs and 15 years for seven FIHCs, and is indefinite for eight FIHCs. After this phase, BITs were up for automatic renewal for either an indefinite period or 10 to 15 years. The expiration of the ‘honeymoon phase’ is a necessary condition for unilateral BIT termination.
Because many BITs were ratified at different times, often decades ago, because treaty terms vary considerably among BITs, and given the arbitrary decision to terminate all BITs at the earliest possible termination date, we argue that effective BIT terminations have been randomized in the case of India. Some BITs were terminated soon after the decision, others followed later and some are still in force today. This not only introduces randomization but also enables us to compare FIHCs with and without BITs over the whole time period (‘always’ and ‘never’ groups) and to compare BIT terminations at different points in time (‘timing’ groups).
The BIT terminations between 2013 and 2018 were conducted with remarkable rigor and in almost every case were implemented exactly at the earliest possible termination date. Looking at the timing, almost all BITs were unilaterally terminated as soon as they exceeded the initial or renewal period.Footnote 13 By the end of 2018, 68 BITs had been terminated and 19 remained in force. These active BITs, with the exception of those with the Philippines and Singapore, had not been eligible for unilateral termination.
Based on this evidence, and the fact that it is impossible to manipulate the ratification dates and treaty terms or anticipate the termination decision in 2013, decades earlier when the treaty terms were determined and ratified, we argue that BIT terminations are plausibly exogenous to India-specific and FIHC-specific factors.Footnote 14 Further concerns of non-random selection are addressed by including additional time-varying control variables to account for observable differences between FIHCs. In addition, we rely on further tests to check on the identifying assumptions by adding FIHC-specific linear and quadratic time trends and removing pre-treatment trends (see Sect. 6.1.1).
BIT termination as sharp break from existing investor protection
BIT termination constitutes an instantaneous and reasonably sharp break from existing investor protection at a clearly defined date. After this date, violations of the investor rights of new investments are handled by domestic courts. For estimating the effect of BITs, the termination perspective has a considerable advantage over the ratification perspective. While newly ratified BITs predictably remain effective over longer time spans, FDI behavior does not necessarily have to adjust instantaneously. The unilateral termination rules in the BITs obliged India to notify BIT partner countries a year in advance of the effective termination date. This rule could lead foreign investors to anticipate a deterioration of conditions (investor protection, general hostility), which would make it more difficult to identify the treatment effect at the termination date. We do not expect such behavior because of the ‘sunset clause’ contained in the Indian BITs, which protects all investments made before BIT termination for an extended period of 5 to 20 years after the termination date. If anything, the notification of termination could trigger an increase in FDI inflows to India before the effective termination date. For the treatment effect itself, however, we expect a substantial and instantaneous drop in FDI inflows in response to effective BIT termination. Beyond these expectations, we address concerns about anticipation by conducting placebo tests (Sect. 6.1.2).
BIT termination as heterogeneous treatment
As Table 1 has shown, BIT terminations in India were scattered heterogeneously over different quarters between 2014 and 2018. Heterogeneity in the treatment effects can introduce bias into the two-way fixed effects estimatorFootnote 15 (De Chaisemartin & D'Haultfoeuille, 2020; Imai & Kim, 2019). Addressing this problem requires a better understanding of the sources of the heterogeneity, which can, aside from heterogeneity in timing, also be heterogeneity in the magnitude of the treatment over time. A deeper understanding of the heterogeneity is therefore important for assessing the validity of the estimators that are available for such a setting.
In our case, the terminations are stable and have a sustained legal effect. Despite some differences, the scope of coverage of Indian BITs is remarkably consistent with respect to important cornerstones of investor protection. ISDS provisions, expropriation clauses and full protection and security are included in all 87 Indian BITs. Fair and equitable treatment (85), national treatment (84) and most-favored-nation clauses (81) are included in most Indian BITs. Looking deeper into ISDS, 10 countries have security exception clauses, another three exclude certain policy areas from ISDS, and one (Slovakia) limits the ISDS provisions. Despite these few deviations, we assume that India's BITs are remarkably similar in the composition of treaty terms and the strength of ISDS provisions.Footnote 16 Moreover, once India had terminated a BIT, the legal situation changed immediately and did not change further over the whole time period under consideration. The scope of the treatment effect is stable because BIT termination has the same consequences for investor protection rights in any year-quarter. Therefore, we address heterogeneity in timing, for which we rely on two recently introduced estimators that we now briefly introduce.
Decomposing the treatment effect using the Goodman-Bacon approach
Given the variation in the timing of BIT terminations, the fixed-effects DD estimator from Eq. (1) is most likely contaminated by FIHC-specific time trends in FDI inflows induced by the termination itself. One way to address this problem is the recently introduced Goodman-Bacon (2018) decomposition. In its simplest version, \(\hat{\beta }^{DD}\) is the weighted average of the DD parameters of all possible 2 × 2 comparisons. Whereas the original specification in Eq. (1) effectively only compares treated and non-treated FIHCs, the decomposition exploits several comparisons of groups treated earlier and later over the period in question. For these comparisons, we apply the specification in Eq. (1) but replace its coefficient with the two-way fixed effects DD estimator \(\hat{\beta }^{DD}\):
$$y_{i,t}\;=\;\gamma_i\;+\;\lambda_t\;+\;\widehat\beta^{DD}\cdot\;M_{i,t}\;+\;X_{i,t}^{'}\;\delta\;+\;\epsilon_{i,t}$$
(2)
where \(y_{i,t}\) represents the FDI inflow from investor i at time t, M again denotes the dummy variable that indicates BIT termination by switching from 0 to 1 and remains constant otherwise, and X is a vector of time-varying covariates. Given the range \(k = 1,2, \ldots, K\) of possible treatment-period interactions, we deploy specifications with different treated and untreated groups. Besides the ‘timing’ groups with ‘early’ (k) and ‘late’ (l) termination, we also consider a never treated (U) and an always treated (A) group. The ‘never’ group comprises the FIHCs that have a valid BIT in place throughout the whole time span and are therefore never treated with termination, while the ‘always’ group comprises the FIHCs that never ratified a BIT in the first place. The decomposed treatment effect contains the following two-by-two estimators (\({\beta }^{2x2}\)) and weights (S):
$${\hat{\beta }}^{DD}={\sum }_{k\ne U}{S}_{kU}{\beta }_{kU}^{2x2}+{\sum }_{k\ne A}{S}_{kA}{\beta }_{kA}^{2x2}+{\sum}_{\begin{array}{c}k\ne U\\ k\ne A\end{array}}{\sum }_{l>k}\left[{S}_{kl}^{k}{\beta }_{kl}^{2x2,k}+{S}_{kl}^{l}{\beta }_{kl}^{2x2,l}\right]$$
(3)
In the two-way fixed effects specification, the estimate of \({\hat{\beta }}^{DD}\) in Eq. (2) is the ‘variance-weighted average treatment effect on the treated’ (VWATT) and represents the ‘weighted average of all possible two-by-two DD estimators’. Following Goodman-Bacon (2018: p. 8), the two-by-two estimators are defined as follows:
$${\beta }_{kU}^{2x2}\equiv \left({\overline{y} }_{k}^{Post\left(k\right)}-{\overline{y} }_{k}^{Pre\left(k\right)}\right)-\left({\overline{y} }_{U}^{Post\left(k\right)}-{\overline{y} }_{U}^{Pre\left(k\right)}\right)$$
(4)
$${\beta }_{kA}^{2x2}\equiv \left({\overline{y} }_{k}^{Post\left(k\right)}-{\overline{y} }_{k}^{Pre\left(k\right)}\right)-\left({\overline{y} }_{A}^{Post\left(k\right)}-{\overline{y} }_{A}^{Pre\left(k\right)}\right)$$
(5)
$${\beta }_{kl}^{2x2,k}\equiv \left({\overline{y} }_{k}^{Mid\left(k,l\right)}-{\overline{y} }_{k}^{Pre\left(k\right)}\right)-\left({\overline{y} }_{l}^{Mid\left(k,l\right)}-{\overline{y} }_{l}^{Pre\left(k\right)}\right)$$
(6)
$${\beta }_{kl}^{2x2,l}\equiv \left({\overline{y} }_{l}^{Post\left(l\right)}-{\overline{y} }_{l}^{Mid\left(k,l\right)}\right)-\left({\overline{y} }_{k}^{Post\left(l\right)}-{\overline{y} }_{k}^{Mid\left(k,l\right)}\right)$$
(7)
The superscripts denote the subperiods after (‘Post’) and before (‘Pre’) treatment, as well as the time window in which the ‘treatment status varies’ (‘Mid’) (Goodman-Bacon, 2018: p. 6). In Eq. (6), for example, the ‘early’ groups serve as the treatment group while the ‘later’ groups serve as the control group. The weights S are determined by group sizes and treatment variance. The treatment variance, and hence the weight, is largest for groups treated in the middle of the study's time span, and the lowest weights are assigned to groups treated near its start and end points (Goodman-Bacon, 2018: p. 8–9).
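The two timing comparisons in Eqs. (6) and (7) and the role of treatment variance can be sketched in a few lines. The function names and all outcome values below are hypothetical. The variance term covers only the \(\overline{D}(1-\overline{D})\) component for a dummy that is switched on for a share \(\overline{D}\) of the periods; the group-size components of the weights S are omitted.

```python
from statistics import mean


def dd_2x2(treat_before, treat_after, ctrl_before, ctrl_after):
    """Difference-in-differences of subperiod means, as in Eqs. (4)-(7)."""
    return ((mean(treat_after) - mean(treat_before))
            - (mean(ctrl_after) - mean(ctrl_before)))


def treatment_variance(share_treated_time):
    """D(1 - D): variance of a dummy that is on for a share D of periods."""
    return share_treated_time * (1.0 - share_treated_time)


# Hypothetical outcomes by subperiod for an early (k) and a late (l) group
y_k = {"pre": [10.0, 12.0], "mid": [7.0, 5.0], "post": [6.0, 4.0]}
y_l = {"pre": [9.0, 11.0], "mid": [10.0, 12.0], "post": [6.0, 6.0]}

# Eq. (6): early group is treated, late group (not yet treated) is the control
beta_kl_k = dd_2x2(y_k["pre"], y_k["mid"], y_l["pre"], y_l["mid"])
# Eq. (7): late group is treated, early group (already treated) is the control
beta_kl_l = dd_2x2(y_l["mid"], y_l["post"], y_k["mid"], y_k["post"])
print(beta_kl_k, beta_kl_l)  # -6.0 -4.0

# Treatment in the middle of the panel has the largest variance
print(treatment_variance(0.5) > treatment_variance(0.1))  # True
```

In the sketch, a group treated halfway through the panel has a variance term of 0.25, whereas a group treated after 10% or 90% of the periods has a term of roughly 0.09, which is why mid-panel terminations carry more weight in the overall estimate.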
The validity of the decomposition rests on two key identifying assumptions that require the treatment effect to be stable over time: homogeneity and monotonicity. The first means that the treatment effect should not vary in intensity over time, while the second excludes the case in which units switch in and out of treatment. As described in the properties of BIT termination in India, our sample contains no FIHCs for which the treatment effect changes in intensity over time and no FIHCs whose BIT was terminated and then replaced by a newly signed BIT during the time span of our study. Therefore, we expect that the Goodman-Bacon approach is valid in our case and that the estimator delivers unbiased estimates of the treatment effect.
Estimating the heterogeneous treatment with temporal correction
As another alternative, we compute the Wald DD and time-corrected Wald ratio (Wald TC) coefficients, which are applicable to DD settings with multiple groups and multiple periods (De Chaisemartin & D'Haultfoeuille, 2018). To implement the temporal correction, we calculate the coefficient for the multiple-group, multiple-period case as a weighted sum of the coefficients at each year-quarter, that is, as the weighted average of the local average treatment effects (LATEs) of FIHCs switching treatment in any year-quarter. This implementation is straightforward when the treatment is homogeneous and monotonic over time. Homogeneous means that we can assume the treatment rate of the treated to be constant over time, and monotonic requires that groups do not switch repeatedly in and out of treatment (de Chaisemartin & D’Haultfoeuille, 2017). The termination of BITs is homogeneous because it becomes effective at a clearly defined date and remains constant until a new BIT is ratified. Monotonicity holds in our case because no new ratifications replaced formerly terminated BITs.
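The basic building block of the Wald DD can be sketched for a single pair of periods and two groups: the DD of the outcome divided by the DD of the treatment rate. The sketch below assumes this two-group, two-period form only; it does not implement the weighting across year-quarters or the temporal correction of the Wald TC. The function name, data layout and numbers are hypothetical.

```python
from statistics import mean


def wald_did(y, d):
    """Wald DD: DD of the outcome divided by the DD of the treatment rate.

    `y` and `d` map (group, period) -> list of observations, with
    group in {"treat", "ctrl"} and period in {0, 1}.
    """
    def did(x):
        return ((mean(x[("treat", 1)]) - mean(x[("treat", 0)]))
                - (mean(x[("ctrl", 1)]) - mean(x[("ctrl", 0)])))

    denominator = did(d)
    if denominator == 0:
        raise ValueError("treatment rate does not change differentially")
    return did(y) / denominator


# Hypothetical outcomes and treatment dummies
y = {("treat", 0): [10.0, 12.0], ("treat", 1): [6.0, 4.0],
     ("ctrl", 0): [9.0, 11.0], ("ctrl", 1): [10.0, 12.0]}
# Sharp design: every treated unit switches, no control unit does
d = {("treat", 0): [0, 0], ("treat", 1): [1, 1],
     ("ctrl", 0): [0, 0], ("ctrl", 1): [0, 0]}

wald = wald_did(y, d)
print(wald)  # -7.0
```

With a sharp treatment such as an effective BIT termination, the denominator equals one and the ratio collapses to the plain 2 × 2 DD of the outcome.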
Another important assumption in our setting is ‘conditional common trends’, which is more credible than assuming just ‘common trends’ when estimating the Wald estimators in the presence of multiple groups and heterogeneous treatment (de Chaisemartin & D’Haultfoeuille, 2017). We control for the covariates that influence the pre-treatment dynamics of the outcomes. Specifically, we rely on the covariates to balance FIHCs in the treatment and control groups in the pre-treatment period using propensity score matching. We calculate the standard errors by cluster bootstrapping. This procedure, together with the fact that our setting matches the assumptions required by the Wald DD estimator and the Wald DD with temporal correction, makes us confident that the use of these estimators is valid and that the results are unbiased.
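Cluster bootstrapping can be sketched as follows, assuming that the clusters are FIHCs and that whole clusters are resampled with replacement. The function name, the number of replications and the data are hypothetical, and the sample mean stands in for the actual estimator.

```python
import random
from statistics import mean, stdev


def cluster_bootstrap_se(clusters, statistic, n_boot=500, seed=0):
    """Bootstrap SE of `statistic`, resampling whole clusters with replacement.

    `clusters` maps a cluster id (here: a hypothetical FIHC) to its list of
    observations; `statistic` takes a flat list of observations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ids = list(clusters)
    draws = []
    for _ in range(n_boot):
        sampled = [rng.choice(ids) for _ in ids]  # resample cluster ids
        pooled = [obs for cid in sampled for obs in clusters[cid]]
        draws.append(statistic(pooled))
    return stdev(draws)


# Hypothetical quarterly inflows for four FIHCs
inflows = {
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
    "C": [0.0, 0.0, 1.0],
    "D": [7.0, 8.0, 9.0],
}
se = cluster_bootstrap_se(inflows, mean)
```

Resampling entire FIHCs rather than individual year-quarters keeps the serial correlation within each investor's series intact in every bootstrap draw.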
For better comparability of the results, we use the same three sets of subsamples for the two Wald estimators as for the Bacon decomposition and add the same control variables.