1 Introduction

Ever since the founding of the European Economic and Monetary Union (EMU) in 1990, the euro as a single common currency for the EMU members is controversially discussed by economists. The controversy arose because the literature on optimum currency areas (OCA) pioneered by Mundell (1961), McKinnon (1963) and Kenen (1969) does not clearly point out how high costs and benefits due to a membership in a common currency area are. Further, it is controversial whether the nineteen members of the euro area sufficiently meet the criteria and requirements stated in the OCA literature to minimize the costs that arise in common currency areas and what the consequences are of not fulfilling these criteria. By now, empirical research on costs and benefits from the euro is mainly about whether the EU and EMU member states satisfy the OCA criteria or not. Overall, the OCA literature lacks on a formal universal theory that clearly identifies the costs and benefits of a common currency adoption and the empirical literature rarely indicates whether a euro adoption is associated with costs or benefits for a particular country. Thus, a meaningful measure of an ex-post benchmark has the potential to shed new light on the euro debate.

More recently, several empirical studies by Fernandez & Garcia Perea (2015), Verstegen et al. (2017), Lin & Chen (2017), Gomis-Porqueras & Puzzello (2018) and Gabriel & Pessoa (2020) have attempted to provide such an ex-post benchmark by applying the synthetic control method (SCM) by Abadie & Gardeazabal (2003) and Abadie et al. (2010, 2015) (AGDH hereinafter). However, the evidence by applying this method is not overly consistent, leading to the conclusion that the method is not too robust to different settings of the adjustment screws such as changes in the donor pool, selection of covariates and the use of pre-intervention outcomes as separate predictors. Although the empirical approach of all of the above studies is very reasonable, we do not know which of the approaches we can trust or how large the euro net effect really is for each EMU member, since the selection of the donor pool and covariates is based on more or less subjective measures and these choices change the outcomes.

Recently, a new SCM based on structural time series models (STMs) and cointegration has been developed by Harvey & Thiele (2021) (HT hereinafter). The approach of the HT-SCM is to identify control series that are comoving to the target series prior the intervention, i.e., the euro adoption. When modelling target and control series in a multivariate STM and restrict them to share a common trend, the synthetic control is obtained by forecasting the common trend for the target series to the post-intervention period. The attraction of this method is, that no covariates are needed, time series dynamics such as structural breaks (e.g., the 2008/2009 financial crisis) can be easily incorporated into the model, and the selection of the control group can be based on cointegration - i.e., the long-term co-movement of two series - as an objective measure. The presence of cointegration can be tested using the test of Kwiatkowski et al. (1992), which is shown to be powerful for small time samples by HT. Plausibility considerations to avoid cointegration by chance in a short time sample can also be taken into account when selecting the control group. Thus, the method is able to solve several issues which occur in the AGDH-SCM studies so far.

This paper aims to contribute to the existing synthetic control studies on the euro net effect by providing new evidence for eight early euro adopters (Austria, Belgium, France, Germany, Greece, Italy, the Netherlands and Spain) and for the EMU aggregate (EA19) by applying the HT-SCM. Once the synthetic controls are estimated, the OCA literature is related to the estimated gaps, and panel regressions are used to test the hypotheses about the effect of the derived main drivers of gains and losses. I find that the euro net effect on GDP per capita is positive for Greece and Spain over the period 1999–2007, the Netherlands gain a bit, Austria and Belgium are only hardly affected, while France and Germany lose over this period. Italy is a clear loser of the euro and loses from 1999 to 2019, as is France, but its losses are not as drastic as Italy’s. The financial and euro crises have a massive impact on losses from the euro adoption, so that after the financial crisis all of the eight EMU members considered (CEA8 hereinafter), with the exception of Germany, lose relative to their synthetic controls. In particular Greece, Italy and Spain are big losers from the euro crisis. Relating the estimated gaps to the main drivers of gains and losses derived from the OCA literature, I find that main drivers of gains are openness, real convergence and labour mobility (which is proxied by net migration). Main drivers of losses are low levels of competitiveness and fiscal instability and labour market rigidity. Examination of the crisis channels of the financial and euro crises shows that fiscal instability, labour market rigidity but also business cycle synchronization cause the severe losses from the crises. Net migration helps to dampen these shocks. While openness is beneficial during the pre-crises period, it could not help to dampen the shock of the crises.

The remainder of this paper is as follows. Section 2 discusses the related literature. Section 3 presents the empirical method of the HT-SCM, it explains the empirical strategy and describes the used data. Section 4 shows the empirical results of the euro net effect for the CEA8 and the EA19 aggregate and performs regression analysis to relate the OCA literature to the estimated gaps. Section 5 concludes.

2 Related literature

2.1 Common currencies and EMU: theory and empirical evidence

Sharing a common currency comes with the cost of losing an individual monetary policy. Instead of having its own central bank, a common central bank manages the monetary policy for the whole currency area. In particular, the inability to devalue or revalue one's currency in response to asymmetric shocks is seen as a major cost. But there are also substantial benefits of a common currency resulting from improved trading conditions with currency partners (economies of scale, no exchange-rate risk, lower transaction- and information costs, price transparency) and increasing trade to these improved conditions (see De Grauwe (2020) for a comprehensive review on costs and benefits of common currency). The early theories of OCAs consider several adjustment and flexibility requirements that members of a common currency should satisfy in order to be able to adjust for asymmetric shocks and minimize their economic losses from abandoning their individual monetary policy. These requirements include labour mobility (Mundell 1961), wage and price flexibility (ibid.), a high degree of openness (McKinnon 1963), product diversification (Kenen 1969), fiscal integration of common currency members (ibid.) and risk sharing through capital market integration (Ingram 1969, 1973; Mundell 1973).Footnote 1

More modern theories focus on the conditions under which a common central bank policy is optimal, even if some of the OCA criteria of the early theories are not satisfied. These conditions are synchronous business cycles (Frankel & Rose 1997, 1998) and synchronous reaction to asymmetric shocks (ibid., Alesina et al. 2003). Moreover, Frankel & Rose (1997, 1998) propose the view that common currency areas are endogenous. The authors argue that sharing a common currency will lead to an increase in economic relations between currency partners, which in turn results in a stronger correlation of their business cycles. As a result, their response to shocks will also become more synchronized. In the end, the common central bank policy may be optimal for all currency members as they converge economically. Thus, even if a monetary union is not optimal ex ante, it can be optimal ex post. Contrary to this view, Krugman (1993) argues that increasing economic relations lead to specialization, which in turn desynchronizes business cycles and responses to shocks. Thus, the common central bank policy may become more expensive for all currency members in the end.

Numerous empirical studies have been conducted on the criteria and conditions of early and modern OCA theories in the EU and EMU. While it seems to be common knowledge that labour mobility, price and wage flexibility and fiscal integration are (too) low and trade openness and product diversification are (sufficiently) high in EU and EMU member states (see Baldwin & Wyplosz 2019, Sect. 15.5), empirical evidence on the degree of risk sharing is mixed. Bekaert et al. (2017) and Hoffmann et al. (2018) find that integration into financial markets has not increased in the euro area. Ferrari & Picco (2016) find that risk sharing across EMU members is decreasing rather than increasing, while Cimadomo et al. (2017) find that 40% of pre-crisis and 65% of post-crisis shocks are dampened by financial integration among euro area members. Integrating financial markets to share risk appears to be an effective tool, but eurozone members are not integrating deeply enough. The empirical literature on the synchronization of business cycles and shocks (see Campos et al. (2017), Campos & Macchiarelli (2016), Wortmann & Stahl (2016), Fingleton et al. 2015, Pentecôte & Huchet-Bourdon (2012)) concludes that there is a core-periphery pattern in the EMU, first identified in Bayoumi & Eichengreen (1993). The core consists of Germany, Austria, Netherlands, Belgium, France, Luxembourg and Finland and is more synchronized, which makes the core countries (better) suited for the formation of a monetary union than the periphery (Greece, Ireland, Italy, Spain and Portugal). With regard to real convergence processes, the empirical literature finds some evidence of beta convergence in the sense of a catching-up process of the peripheral countries (which are also the relatively poorer countries in EMU) (see, for example, Dreyer & Schmid (2016) or Franks et al.). Economic relations between the currency partners are estimated on the basis of their multilateral trade relations. The empirical evidence on the trade effect in the euro area is also mixed, but differs only in the size of the trade increase due to the introduction of the euro. Rose (2016) examines a meta-analysis on the euro effect on trade and finds that trade between euro partners increased by 8.9% to 12.3%.

There is a relatively small strand in the empirical literature that focuses on ex ante benchmarks of euro adoption in a single country. Bayoumi & Eichengreen (1997), Horvath & Komarek (2003), Horvath (2005), and Frydrych & Burian (2017) attempt to provide ex ante benchmarks by quantifying OCA indices that measure a country's eligibility for eurozone membership based on some of the aforementioned OCA criteria. However, these indices are very sensitive to input data, and minor methodological differences lead to different results. As a result, the indices are hardly considered reliable (ibid., p. 196).

Considered as a whole, the theoretical and empirical literature on OCAs does not provide clear evidence of a negative or positive euro net effect, and ex-ante benchmarks are only hardly reliable. Most studies focus on the OCA criteria and whether or not they are sufficiently met by EMU members. What has been missing from the OCA literature is a reliable empirical measure of how large the gains or losses from joining EMU are for a given country. Empirical research provides only a vague suggestion that peripheral countries may catch up due to convergence processes, while core countries may perform better in times of crises because their response to exogenous shocks is more synchronous and ECB policies to respond to shocks are therefore more optimal for core countries than for peripheral countries. Therefore, a meaningful and reliable measurement of an ex-post benchmark for the success of a euro adoption could open new perspectives on the relative importance of the OCA criteria, and assessing the impact of the euro effect for several countries could have great potential to provide clarity on the euro debate.

2.2 Synthetic control methods to estimate euro net effects

An empirical procedure for ex post evaluation of the euro adoption should give a reasonable answer to the question “what would be the level of prosperity in a country if it had not joined EMU?”. To answer this question, a control is needed for exactly the same country that has not joined the euro. In macroeconomics, however, these exact controls do not exist, so a meaningful synthetic control must be constructed for these types of questions. So far, the most popular and widely used method to construct such a synthetic control is the AGDH-SCM. This method uses a data-driven procedure to construct a synthetic counterfactual as a weighted combination of control series. The algorithm chooses the weights, which sum to one, such that the characteristics (certain covariates) of the synthetic control best match those of the target series during the period before the specific intervention. In other words, the algorithm minimizes the root mean square prediction error of the pre-intervention period with respect to the weights. The development of this methodology, as well as a reasonable length of the post-euro adoption period, prompted a growing number of researchers to conduct empirical studies of the net euro effect for several euro area members.

More recently, the studies by Fernandez & Garcia Perea (2015), Verstegen et al. (2017), Lin & Chen (2017), Puzzello & Gomis-Porqueras (2018) and Gabriel & Pessoa (2020) have used this method to estimate the euro net effect on GDP per capita for some or all of the twelve early euro adopters. Table 1 summarizes their results. For some countries the estimates of the synthetic controls are quite similar (Italy, Portugal, France) but for others they are not very consistent (Germany, Austria, Belgium, Netherlands). Overall, there is some noticeable variation within the results and we do not know which model or which study to trust (more). As a result, there is no robust evidence about winners and losers from the euro adoption in these studies.

Table 1 Overview about euro net effect studies so far

There are some reasons that could explain the different results among the studies:

  1. a)

    The use of the "donor pool" of potential control countries varies across studies and the selection of the donor pool is based on more or less subjective plausibility considerations. While Lin & Chen (2017) restrict the potential control countries to EU members or to geographically close countries, Fernandez & Garcia Perea (2015), Verstegen et al. (2017) and Gabriel & Pessoa (2020) restrict the potential control countries to OECD members that are economically close to the country of interest. Puzzello & Gomis-Porqueras (2018) restrict the donor pool to all countries in the world that are economically close to the country of interest.

  2. b)

    The covariates to explain GDP per capita differ among the studies.

To a): The data driven algorithm tries to match the country of interest as closely as possible to a weighted average of the countries in the donor pool in the pre-intervention period. Clearly, the control group chosen is different if the donor pool is different, and the synthetic control is more or less based on these subjective measures, even if the actual selection of the control group is data-driven. Hence, the selection of the donor pool should be based on an objective measure to overcome the problem of subjectivity. To b): The data-driven procedure for selecting weights also depends on the relative importance on the covariates in explaining GDP per capita. If the covariates are different, then the relative importance and hence the estimated weights may also be different.

If there are differences between studies in both (a) and (b), the differences in estimated synthetic controls in the post-intervention period may be even larger. This point is also made by Ferman et al. (2020) who focus on “specification searching opportunities” and discuss that different specifications of pre-intervention periods and covariates can lead to substantially different synthetic controls. However, because the algorithm always estimates the weights so that the synthetic control is as close as possible to the target series in the pre-intervention period, it is possible that the synthetic controls in the pre-intervention period are very similar.

Kaul et al. (2015) give another reason why the results might differ. They show that when all pre-intervention outcomes are used as separate predictors, all other covariates become irrelevant and only the pre-intervention fit of the synthetic control is optimized. They also show that the estimation of the synthetic control can change drastically if the number of pre-intervention outcomes as predictors is restricted. Therefore, the post-intervention synthetic control may differ even if the donor pool and the covariates are the same. While Lin & Chen (2017) do not use all pre-intervention outcomes as predictors, it is not clear from the description of the other studies whether any, some or all pre-intervention outcomes are used as separate predictors.

In addition to these issues related to the AGDH-SCM, there are some general issues that may arise when using synthetic control methods to assess the euro net effect (or other macroeconomic issues). For example, an implicit and inherent assumption of any synthetic control method is that the difference between the target series and its synthetic control series in the post-intervention period is entirely driven by the intervention. However, it is possible that countries will be affected by country-specific shocks after the introduction that are not due to this intervention (e.g., Germany and France were affected by the “dot-com” crisis, or early 2000s recession, much more than other EMU-members and control countries) and the synthetic control is not able to distinguish whether this shock is caused by the intervention or by another event. Conversely, a control country could be affected by an individual shock in the post-intervention period. Since the synthetic control is created with some weight on that country, it may also be biased. If a synthetic control method cannot capture these country-specific shocks in some of the control series during the post-intervention period, these series should be excluded from the donor pool so that the synthetic estimate is not biased due to these country-specific shocks. Another issue to consider is global shocks such as the financial crisis, oil price shocks, or the rise of China. Since these shocks apply to both the target series and the control group, synthetic control methods implicitly capture these effects as they affect the control group.

Taking all of the above problems into account, the AGDH-SCM provides reasonable measures of the performance of the various EMU members after euro adoption, but the rather mixed results from the various studies suggest that it is sensitive to the donor pool and the choice of covariates. A new method proposed by HT (2021) based on structural time series models (STMs) and cointegration is able to solve the major issues of the AGDH-SCM. In this approach, the synthetic control is implicitly estimated by a multivariate balanced growth STM in which all the series in the model are described with the same trend and only individual differences in height from it. Cointegration serves as an objective criterion for the selection of the control group and is the basis for the validity of the balanced growth STM. The rational of this approach is that countries that are cointegrating share a common trend, and predicting this common trend in the post-adoption period is the synthetic control. Plausibility considerations, such as similar economic development, could also be taken into account when selecting the control group. Weights that sum to one are implicitly given by variance–covariance relations. No covariates are needed to estimate the common trend. An important feature of the HT-SCM approach is that time series dynamics are tracked in the model and common structural breaks like the financial crisis in 2008–2009 as well as country-specific structural breaks (e.g., the German reunification) can be easily accounted for both in the pre and post adoption period. But like the AGDH-SCM, the HT-SCM is unable to distinguish between the effect of an intervention and other country-specific shocks that occur at the same time in the post-intervention period. For this study, as for all other synthetic control studies so far, we must assume that differences of the synthetic controls and the target series are entirely due to the euro adoption.

The HT-SCM approach has two major weaknesses compared to the AGDH-SCM approach. First, estimated weights can be negative. Second, a large number of dummy variables need to be included in the model to capture outliers and to model the intervention of the euro adoption. While the first issue can be solved by removing the corresponding control series (this series does not contribute anything to the synthetic control), the second issue mainly concerns the length of the post-intervention period and the detection of outliers. The only way to deal with the number of parameters is to model the intervention with as few parameters as necessary, but as many as needed, and to perform a careful outlier analysis. Robustness analyses to address these issues (see Sect. 4.3) show that the HT-SCM estimates are robust to the number of parameters and modelling outliers. Thus, the HT-SCM can address several problems of the AGDH-SCM studies discussed above, while its weaknesses appear to be minor.

The aim of this paper is to contribute to the existing synthetic control literature on the euro effect by applying the new method of HT (2021). After estimating the synthetic controls, hypotheses about the main drivers of the gains and losses from EMU membership are tested with a panel regression analysis. New evidence on both the estimated synthetic controls and the main drivers of gains and losses could clarify the controversial debate over the adoption of the euro.

3 Methodology

3.1 Structural time series models and synthetic controls

The basic idea of STMs is to decompose a time series into several components that are not directly observable but can be directly interpreted, such as trend, seasonality, and cycle.Footnote 2 The basic univariate STM for the empirical research of this paper consists of a decomposition of the GDP per capita series of any country into a local linear trend model, which is

$$y_{t} = \mu_{t} + \varepsilon_{t} ,\;\varepsilon_{t} \sim N\left( {0,\sigma_{\varepsilon }^{2} } \right),\;t = 1, \ldots .,T,$$
(1)
$$\mu_{t + 1} = \mu_{t} + \beta_{t} + \xi_{t} ,\;\beta_{t + 1} = \beta_{t} + \zeta_{t} ,\;\xi_{t} \sim N\left( {0,\sigma_{\xi }^{2} } \right),\;{ }\zeta_{t} \sim N\left( {0,\sigma_{\zeta }^{2} } \right)$$
(2)

where \({y}_{t}\) is annual GDP per capita in logs, \({\mu }_{t}\) is the level component, \({\beta }_{t}\) is the slope component of the trend and \({\varepsilon }_{t}\), \({\xi }_{t}\) and \({\zeta }_{t}\) are mutually independent disturbance terms with zero mean and variances \({\sigma }_{\varepsilon }^{2}\), \({\sigma }_{\xi }^{2}\) and \({\sigma }_{\zeta }^{2}\) respectivly. If \({\zeta }_{t}=0\), the trend component is called a random walk with drift, while if \({\xi }_{t}=0\) the trend component is called an integrated random walk. Equation (1) is called observation equation. The equations for the level and slope in (2) together form the state equations.

If \(N\) univariate series are modelled in this way, the univariate STM can be generalised to the following multivariate STM

$$\begin{gathered} {\varvec{y}}_{t} = {\varvec{\mu}}_{t} + {\varvec{\varepsilon}}_{t} , \hfill \\ {\varvec{\mu}}_{t + 1} = {\varvec{\mu}}_{t} + {\varvec{\beta}}_{t} + {\varvec{\xi}}_{t} ,\;t = 1, \ldots , T \hfill \\ {\varvec{\beta}}_{t + 1} = {\varvec{\beta}}_{t} + {\varvec{\zeta}}_{t} , \hfill \\ {\varvec{\varepsilon}}_{t} \sim N\left( {0,{\varvec{\varSigma}}_{\varepsilon } } \right),\;{\varvec{\xi}}_{t} \sim N\left( {0,{\varvec{\varSigma}}_{{\varvec{\xi}}} } \right),\;{ }{\varvec{\zeta}}_{t} \sim N\left( {0,{\varvec{\varSigma}}_{{\varvec{\zeta}}} } \right) \hfill \\ \end{gathered}$$
(3)

where \({{\varvec{y}}}_{t}\) is an \(N\times 1\) vector of GDP per capita series in logs, \({{\varvec{\mu}}}_{t}\) is an \(N\times 1\) vector of indivudial trend components of each series, \({{\varvec{\beta}}}_{t}\) is an \(N\times 1\) vector of indivudial slope components of the individual trends and \({{\varvec{\varepsilon}}}_{t}\),\({{\varvec{\xi}}}_{t}\) and \({{\varvec{\zeta}}}_{t}\) are \(N\times 1\) vectors of disturbances with zero mean and \(N\times N\) variance covariance matrices \({\boldsymbol{\Sigma }}_{\varepsilon }\), \({\boldsymbol{\Sigma }}_{{\varvec{\xi}}}\) and \({\boldsymbol{\Sigma }}_{{\varvec{\zeta}}}\) respectively. \({{\varvec{\varepsilon}}}_{t}\), \({{\varvec{\xi}}}_{t}\) and \({{\varvec{\zeta}}}_{t}\) are assumed to be uncorrelated to each other. The advantage of this multivariate model is that if \({\boldsymbol{\Sigma }}_{\varepsilon }\) is non-diagonal the disturbances \({{\varvec{\varepsilon}}}_{t}\) of the series are correlated, leading to efficiency gains in estimation and forecasting. The correlation between disturbances can be seen as a control for the general economic environment for all series. Balanced growth means that the N series have a stable relationship over time, implying a stationary difference and hence cointegration between each pair of the series. Thus, a balanced growth model is a special case of (3) where \(rank\left({\boldsymbol{\Sigma }}_{{\varvec{\xi}}}\right)=1\), so that it can be simplified to

$$\begin{gathered} {\varvec{y}}_{t} = {\varvec{i}}\mu_{t} + \overline{\user2{\mu }} + {\varvec{\varepsilon}}_{t} ,\;t = 1, \ldots ,T \hfill \\ {\varvec{\varepsilon}}_{t} \sim N\left( {0,{\varvec{\varSigma}}_{\varepsilon } } \right) \hfill \\ \end{gathered}$$
(4)

where \({\mu }_{t}\) is a univariate local linear trend as in (2), \({\varvec{i}}\) is an \(N\times 1\) vector of ones and \(\overline{{\varvec{\mu}} }\) is an \(N\times 1\) vector of unrestricted constants that captures the height differences of the series. The elements of \({{\varvec{\varepsilon}}}_{t}\) are assumed to be correlated so that \({\boldsymbol{\Sigma }}_{\varepsilon }\) is non-diagonal.

HT (2021) show how the balanced growth model can be used to construct a synthetic control and assess the impact of an intervention such as the euro adoption. Assume, that the first series in (4), \({y}_{1t}\), is affected by the intervention of the euro adoption, while all other series are independent of this intervention. The euro adoption may affect this series in order to change its level or its slope. The empirical analysis of this paper concentrates on potential level changes due to the intervention. Such an intervention can be modelled as a combination of pulse and step dummy variables in the target series equation. After modelling the intervention, the first series in (4) becomes

$$y_{1t} = \mu_{t} + \overline{\mu }_{1} + \sum \lambda_{p} d_{pt} + \sum \lambda_{s} d_{st} + \varepsilon_{{1{\text{t}}}} ,\;t = 1, \ldots ,T.$$
(5)

where pulse dummies are denoted with subscript \(p\) and step dummies are denoted with subscript \(s\). \({\lambda }_{p}\) and \({\lambda }_{s}\) are their parameters. Pulse and step dummies are defined as

$$\begin{gathered} \begin{array}{*{20}c} {d_{{pt}} = \left\{ {\begin{array}{*{20}l} {0,t \ne \tau } \hfill \\ {1,t = \tau } \hfill \\ \end{array} ,} \right.} & {t = 1, \ldots ,T,} \\ \end{array} \hfill \\ \begin{array}{*{20}c} {d_{{st}} = \left\{ {\begin{array}{*{20}l} {\begin{array}{*{20}l} {0,t < \tau _{l} } \hfill \\ {1,\tau _{l} \le t \le \tau _{u} } \hfill \\ \end{array} } \hfill \\ {0,~t > \tau _{u} } \hfill \\ \end{array} } \right.,} & {t = 1, \ldots ,T,} \\ \end{array} \hfill \\ \end{gathered}$$
(6)

where \(\tau\) is a single time point and \({\tau }_{l}\) and \({\tau }_{u}\) are the lower and upper bounds of a time range. Pulse dummies capture effects only at a single time point, so they can be used to model temporary intervention effects, whereas step variables capture effects at more than one time point, so they can be used to model intermediate intervention effects if \({\tau }_{u}<T\) or permanent intervention effects if \({\tau }_{u}=T\).

Step and pulse dummy variables can also be included in the model to account for temporary and intermediate country-specific deviations from the common trend or individual structural breaks. A structural break can be modelled with a step variable with \({\tau }_{u}=T\). To allow for country-specific deviations, a more general balanced growth model is formulated as

$$\begin{gathered} y_{it} = \mu_{t} + \overline{\mu }_{i} + \sum \psi_{ip} d_{ipt} + \sum \psi_{is} d_{ist} + \left( {1{\text{|i}} = 1} \right){*}\left( {\sum \lambda_{p} d_{pt} + \sum \lambda_{s} d_{st} } \right) + \varepsilon_{it} , \hfill \\ i = 1, \ldots ,N,\;t = 1, \ldots ,T. \hfill \\ \end{gathered}$$
(7)

Again, the subscripts \(p\) and \(s\) denote pulse and step dummies in the form of (6) and \({\lambda }_{p}\) and \({\lambda }_{s}\) are the corresponding intervention parameters, while \({\psi }_{ip}\) and \({\psi }_{is}\) are the corresponding parameters to account for deviations from the common trend of country \(i\). The first equation in (7) corresponds to an EMU member while the other \(N-1\) equations correspond to the control countries. The indicator variable ensures that the intervention modelling only appears in the first equation.

Also consider common structural breaks that can occur when all series in the balanced growth model are affected by the same shock, such as the 2008–09 financial crisis. These types of shocks affect the common trend. We can capture such shocks by adding pulse and step dummy variables as defined in (6) to the trend equation, which then becomes

$$\mu_{t + 1} = \mu_{t} + \beta_{t} + \sum \varphi_{bp} d_{bpt} + \sum \varphi_{bs} d_{bst} + \xi_{t}$$
(8)

\({\varphi }_{bp}\) and \({\varphi }_{bs}\) are the parameters corresponding to the pulse and step dummies. Subscript \(b\) means that the variable occurs in the trend equation, while the other subscripts are analogous to those in Eq. (6).Footnote 3

The basic model for the empirical analysis of this paper is the balanced growth model given in (7), where the common trend is modelled as in (2) or (8), depending on whether there are any structural breaks. To estimate the model, it must be put in state space form. The regression components, i.e., the dummy variables are included in the state vector. Then the Kalman Filter and Smoother are applied to estimate the unobserved components. The hyperparameters corresponding to the variances and covariances of the error components are estimated using maximum likelihood. The likelihood function in this regard can be computed directly by the Kalman Filter equations (see Durbin & Koopman 2012, Sect. 2.10).

Weights that sum to one are implicitly given by variance–covariance relations of the series. If \({{\varvec{w}}}_{1}\) denotes the \((N-1)\times 1\) vector of weights to construct the synthetic control \({y}_{1t}^{c}\), they are given by

$$\begin{gathered} \user2{w}^{\prime }_1 = \user2{\beta }_{1}^{{^{\prime } }} + s~{\text{i}}\prime \user2{\Sigma }_{{\varepsilon \left( { - 1} \right)}}^{{ - 1}} , \hfill \\ \user2{\beta }_{1}^{{^{\prime } }} = \user2{\sigma }_{\varepsilon }^{{^{\prime } }} \user2{\Sigma }_{{\varepsilon \left( { - 1} \right)}}^{{ - 1}} , \hfill \\ \user2{\sigma }_{\varepsilon } = \text{cov} \left( {\varepsilon _{{1t}} ,\user2{\varepsilon }_{{\left( { - 1} \right)t}} } \right), \hfill \\ s = \left( {1 - \user2{\beta ^{\prime}}_{1} {\text{i}}} \right)/{\text{i}}\prime \user2{\Sigma }_{{\varepsilon \left( { - 1} \right)}}^{{ - 1}} \user2{i} \hfill \\ \end{gathered}$$
(9)

where \({{\varvec{\beta}}}_{1}\) and \({{\varvec{\sigma}}}_{\varepsilon }\) are \((N-1)\times 1\) vectors and \({\varvec{i}}\) is an \((N-1)\times 1\) vectors consisting of ones. \({\boldsymbol{\Sigma }}_{\varepsilon (-1)}\) is an \((N-1)\times (N-1)\) matrix and the subscript \(\left(-1\right)\) means that the components referring to the first row are omitted in the respective vector or matrix. A shortcoming of the HT-SCM is that “there is nothing to prevent some of the weights being negative” (HT 2021, p. 4). In relation to the AGDH-SCM, the synthetic control could be constructed directly as a weighted combination of the control series. However, constructing the synthetic control this way ignores common and country-specific structural breaks and country-specific deviations from the common trend. A more accurate measure of the synthetic control can be constructed from the output of the Kalman Smoother,Footnote 4 which is given by

$$\begin{gathered} \hat{y}_{1t}^{c} = \hat{\mu }_{t} + \widehat{{\overline{\mu }}}_{1} + \sum \hat{\psi }_{1p} d_{1pt} + \sum \hat{\psi }_{1s} d_{1st} + \hat{\varepsilon }_{1t} , \hfill \\ \hat{\varepsilon }_{1t} = \hat{\user2{\beta }}^{\user2{^{\prime}}} \left( {\hat{\user2{y}}_{{\left( { - 1} \right)t}}^{ - } - \widehat{{\overline{\user2{\mu }}}}_{{\left( { - 1} \right)}} - {\varvec{i}}\hat{\mu }_{t} } \right) \hfill \\ \hat{\user2{y}}_{t}^{ - } = \hat{y}_{it}^{ - } = y_{it} - \sum \hat{\psi }_{ip} d_{ipt} - \sum \hat{\psi }_{is} d_{ist} . \hfill \\ \end{gathered}$$
(10)

Note that if there are country-specific outliers and structural breaks, the construction of the synthetic control is not based on the original series of the control group,\({{\varvec{y}}}_{\left(-1\right)t}\), but on the model prediction adjusted for these deviations, \({\widehat{{\varvec{y}}}}_{(-1)t}^{-}\). When there are no structural breaks or country-specific deviations, as in the model (4) + (5), the construction of the synthetic control using (10) simplifies to

$$\begin{gathered} \hat{y}_{1t}^{c} = \hat{\mu }_{t} + \widehat{{\overline{\mu }}}_{1} + \hat{\varepsilon }_{1t} , \hfill \\ \hat{\varepsilon }_{1t} = \hat{\user2{\beta }}^{\user2{^{\prime}}} \left( {{\varvec{y}}_{{\left( { - 1} \right)t}} - \widehat{{\overline{\user2{\mu }}}}_{{\left( { - 1} \right)}} - {\varvec{i}}\hat{\mu }_{t} } \right) \hfill \\ \end{gathered}$$
(11)

The summations of the intervention parameters, \(\sum {\widehat{\lambda }}_{p}{d}_{pt}+\sum {\widehat{\lambda }}_{s}{d}_{st},\) in (5) and (7) is the estimate of the gap between the original target series, \({y}_{1t}\), and its synthetic control \({\widehat{y}}_{1t}^{c}\) at time \(t\ge \tau\). The intervention modelling is chosen so that \(\sum {\widehat{\lambda }}_{p}{d}_{pt}+\sum {\widehat{\lambda }}_{s}{d}_{st}\approx {y}_{1t}-{\widehat{y}}_{1t}^{c}\) with as many parameters as necessary but as few as possible.

3.2 Diagnostics checking and empirical remarks

To ensure that a model is well specified, several diagnostic tests must be performed on standardized innovations (i.e., the one-step-ahead-prediction errors of the Kalman Filter) as well as an outlier analysis using standardized smoothed state residuals and standardized smoothed observation residuals (irregular hereinafter). According to Harvey & Koopman (1992) and Harvey et al. (1998) outliers in standardized irregular residuals indicate series-specific temporary effects that can be captured with a pulse dummy in the respective observation equation. In the case of the balanced growth model, an outlier in standardized irregulars indicates a strong deviation of the respective time series from the common trend. Outliers in standardized smoothed state residuals indicate a common structural break representing a level change in the common trend.

In the balanced growth model, there is no disturbance term that directly indicates structural breaks in a particular series,Footnote 5 since the standardized smoothed state residuals indicate only common structural breaks, and a structural break in a series accounts for only a weighted portion of the total residual. If such a weight is small, the structural break in the residual cannot be traced; if such a weight is large, the residual may indicate a common structural break even though the outlier is generated by only a single series. In addition, a series-specific structural break will increase the variance of the irregulars of that series, so the standardized irregulars may not indicate outliers. However, the standardized irregulars will have a structure: Suppose there is a series-specific negative structural break at time \(\uptau\) while there are no other series-specific outliers. Then, standardized irregulars will fluctuate around a positive mean in \(t=1,\dots ,\) \(\tau\) and then around a negative mean in \(t=\tau +1,\dots ,T\). The weighted sum of the positive mean in \(t=1,\dots ,\) \(\tau -1\), and the negative mean in \(t=\tau ,\dots ,T\), will be close to zero as the standardized irregular are \(\sim N\left(\mathrm{0,1}\right)\).

As mentioned by Commandeur & Koopman (2007, p. 96), outlier analysis is not about mindlessly adding every single outlier discovered. Especially adding structural breaks should be confirmed by a theory or a historical event concerning the possible cause of the structural break. In this regard, common structural breaks are justified in times of global economic crises like the financial crisis in 2008–2009, the oil crises in 1973 and 1979 or the early 1980s and early 1990s recession. Individual structural breaks or temporary outliers should be confirmed by country-specific recessions or booms. Moreover, outlier analysis should be a tool to confirm the results of visual analysis, and visual analysis should be a tool to confirm the results of outlier analysis. Therefore, in the empirical analysis of this paper, the outlier analysis is always performed together with the visual analysis.

Note that adding a pulse dummy to the observation equation actually pushes the respective series to the common trend for the respective time point, while the observation for which the pulse dummy is added has no weight in the construction of the common trend for that time point (This follows from the fact that adding a pulse variable is the same as replacing the respective observation with a missing value). Adding a step variable to an observation equation pushes the series to the level of the common trend for the length of the step. This is an important feature, especially in the post-adoption period, as it allows the common trend to be constructed without country-specific strong deviations (This is not possible with an AGDH-SC). The drawback of this feature is that adjustments in the post-adoption period can only be added if there are at least three controls. With only two controls, the common trend in the post-adoption period is made of only two series. If, a pulse or step variable is added to an observation equation of one of the series, the common trend in that time period consists of only one series and thus the synthetic control for those time points consists of only one series.

Model diagnostics such as normality, serial correlation and heteroscedasticity are checked with standardized innovations, as recommended by Harvey et al. (1998, p. 111f.). Appropriate standardized innovations for diagnostics testing in a model with regression components are the generalised least squares (GLS) residuals (Harvey (1989, p. 386f.)). The GLS residuals are the standardized innovations that are produced when the Kalman Filter is applied to \({{\varvec{y}}}_{t}\) minus the estimated regression part of the model.Footnote 6 Therefore, it is required that the Kalman Smoother estimates of the structural breaks modelled in the trend equation in (8) are reformulated as variables affecting the observations \({y}_{it}\). The reformulation is the cumulative sum of the original dummy variable (see Commandeur & Koopman (2007, p 78ff.)):

$$\begin{gathered} \begin{array}{*{20}c} {d_{bpt} \to d_{bpt}^{y} = {\Sigma }_{t = 1}^{{\text{T}}} d_{bpt} ,} & {,t = 1, \ldots ,T,} \\ \end{array} \hfill \\ d_{bst} \to d_{bst}^{y} = {\Sigma }_{t = 1}^{{\text{T}}} d_{bst} , \hfill \\ {\Sigma }_{t = 1}^{{\text{T}}} d_{bpt} ,\; {\Sigma }_{t = 1}^{{\text{T}}} d_{bst} = 0 ,\;{\text{when}}\; t = 1 \hfill \\ \end{gathered}$$
(12)

To produce the GLS residuals, a second Kalman Filter is applied to (4) but with \({\widehat{{\varvec{y}}}}_{t}^{*}\) instead of \({{\varvec{y}}}_{t}\). \({\widehat{{\varvec{y}}}}_{t}^{*}\) is defined as

$${ \begin{aligned} \user2{\hat{y}}_{t}^{*} = & \hat{y}_{{it}}^{*} ,\;~i = 1, \ldots ,N,\;t = 1, \ldots ,T, \\ \hat{y}_{{it}}^{*} = & y_{{it}} - \sum \hat{\varphi }_{{bp}} d_{{bpt}}^{y} - \sum \hat{\varphi }_{{bs}} d_{{bst}}^{y} - \sum \hat{\psi }_{{ip}} d_{{ipt}} - \sum \hat{\psi }_{{is}} d_{{ist}} ~ \\ ~ & - \left( {1{\text{|}}i = 1} \right)*\left( {\sum \hat{\lambda }_{p} d_{{pt}} + \sum \hat{\lambda }_{s} d_{{st}} } \right) \\ \end{aligned}} $$
(13)

and the parameters \({\widehat{\varphi }}_{bp}\), \({\widehat{\varphi }}_{bs}\), \({\widehat{\psi }}_{ip}, {\widehat{\psi }}_{is},\) \({\widehat{\lambda }}_{p}\) and \({\widehat{\lambda }}_{s}\) are estimates of the Kalman Smoother of the model in (7) with the trend component as in (8).

Finally, it should be noted that model diagnostics based on the GLS residuals can only be performed in the non-diffusion phase of the Kalman filter. The Kalman filter is initialised in the same way for all models. The trend component \({\mu }_{t}\) is initialized by setting \({\mu }_{0}=0\) so that \(\overline{{\varvec{\mu}} }\) is subject to a constraint and contains only \(N-1\) free parameters as noted by Carvalho & Harvey (2005). Both the slope component \({\beta }_{0}\) as well as \(\overline{{\varvec{\mu}} }\) are initialized with a diffuse prior. Thus, the diffuse phase of the Kalman Filter lasts for the first two periods and the model diagnostics can be checked with the GLS residuals in \(t=3,\dots ,T\).

3.3 Data and empirical strategy

To analyse the euro net effect on euro area economies, annual data on real GDP per capita at constant 2010 prices from 1970 to 2019 are used and transformed into logarithms. The data stem from the World Bank’s world development indicators database and are available for 122 countries for the mentioned time series length. Among these 122 countries are 14 countries that have adopted the euro.

At first, the donor pool of potential control countries consists of the 108 non-euro area countries, but in a second stage it is restricted to fulfil three control group criteria: cointegration, small variance in the difference between the target and control countries in the pre-intervention period and economic similarity. Therefore, only a small fraction of the 108 potential control countries remains as suitable control countries for the various models. The net effect of the euro on annual real GDP per capita is analysed for eight of the twelve countries that adopted the euro early (Austria, Belgium, France, Germany, Greece, Italy, the Netherlands and Spain) as well as for the EA19 aggregate.

Finland is excluded from the analysis because of its major crisis around 1990, which has made it impossible to construct a suitable model for evaluating the euro effect. For Portugal, the criteria for the control group are too restrictive, so that no suitable control group can be found. The same applies to Ireland and Luxembourg, both of which are highly individual in terms of their economic development. Ireland is considered a "Celtic tiger" due to its rapid growth since the mid-1990s, while Luxembourg is both one of the richest and one of the smallest countries in the world. The fact that no suitable control group could be found for Ireland and Luxembourg in the studies mentioned in Table 1 and in the present study could reflect their individual growth paths. The euro adoption date for all CEA8 is 1999, except for Greece, which officially joined in 2001. Despite this small temporal difference, as in Verstegen et al. (2017) and Fernandez & Garcia Perea (2015), 1999 is chosen as the intervention date of euro adoption for Greece. This is justified by the fact that Greece had to fulfil the criteria for joining EMU for at least two years before officially joining and therefore had to adapt to the economic realities of the EMU system in 1999. Nonetheless, a model in which Greece adopts the euro in 2001 is estimated to test robustness. Due to economic turbulence and some economic differences between the control and target countries in the 1970s, the estimation sample for none of the models starts in 1970, yet the largest possible time series length is chosen for all models.

The empirical strategy for assessing the euro net effect is closely related to that proposed by HT (2021). First, cointegration between the target series and potential control countries is tested in the whole sample. To test for cointegration the test for level stationarity by Kwiatkowski et al. (1992) – KPSS hereinafter – is used by applying it to the difference between the target and control countries for the pre-intervention period. The difference is also examined graphically to ensure that the series are cointegrating and to look for distortions between the series that should be incorporated into the model by adding appropriate dummy variables. Another statistical criterion for selecting the control group is that the variance of the differences between the target and control groups should be relatively small because "controls with relatively large variances tend to be downweighted" (see HT 2021, p. 4). Consequently, including a control country with a relatively large variance in the model does not contribute much to the estimation of the synthetic control, while requiring more parameters to be estimated. Therefore, controls with relatively large variances are removed from the donor pool. Note that the difference between a perfect control and the target is zero in the pre-intervention period, implying that the control country is exactly equal to the target, with the only difference being that it is not affected by the intervention. Thus, the smaller the variance of the difference between control and target, the more suitable it is as a control, given that it is cointegrating. Plausibility considerations are also applied to the selection of the control group. In this context, only controls with a similar economic development in terms of GDP per capita are selected. To get an idea of the economic similarity, the mean difference in GDP per capita between the target country and the control group in the pre-euro period and the mean value of GDP per capita in the pre-euro period of the control country are examined. If the mean difference in GDP per capita is close to the mean of the control country, a control is considered economically similar. Once an appropriate control group is selected, KPSS tests for level and trend stationarity are performed to determine whether the series are best described by an integrated random walk, a random walk with drift, or with stochastic components for trend and slope.

The question is how to model the intervention. HT (2021) consider an approach based entirely on estimation under the balanced growth model. In this context, a model is estimated in which the values of the target series in the post-intervention period are replaced by missing values (this is the same as modelling impulse dummies for each post-intervention period), and the difference between the original target series and the synthetic control estimate of this model provides guidance on how to model the intervention. A balanced growth model in which target series values are replaced by missing values after the intervention is called a BGMTV (balanced growth with missing target values) model, and a synthetic control estimated with such a model is called a BGMTV synthetic control. Based on the difference between the original target series and the BGMTV synthetic control in the post-intervention period, the intervention is modelled with as few parameters as possible and with as many parameters as necessary, so that whenever the difference indicates an intermediate or permanent step, these steps are estimated with a single step variable instead of a bundle of pulse dummies.

Adding outliers and common and country-specific structural breaks affects the estimation of the common trend as well as the weights and \(\widehat{{\varvec{\beta}}}\). Hence, to get an idea of how to model the intervention, the underlying BGMTV model should include all relevant step and pulse variables to account for these outliers. Therefore, a preliminary outlier analysis is performed with a BGMTV model to check whether the model is well specified. Only after ensuring that the BGMTV-model captures all the relevant structural breaks and outliers, the difference between the target and the estimated BGMTV synthetic control is explored to model the intervention. After estimating a full model with original target values in the pre- and post-adoption period that includes the relevant outliers and structural breaks, as well as the intervention modelling, a final check of outliers ensures that the model is well specified.

Finally, the model diagnostics are tested based on the GLS residuals for the target series in the full model. The Jarque–Bera (JB) test is used to test normality. Autocorrelation is tested with the Ljung-Box-Q(p,l)-test, which tests the Q-statistic for the first p autocorrelations against a \({\mathrm{\rm X}}^{2}(l)\)-distribution, where \(l=p-w+1\) and w denotes the number of estimated variances belonging to the tested series. For a univariate random walk with drift, \(w=2\) since \({\sigma }_{\varepsilon }^{2}\) and \({\sigma }_{\xi }^{2}\) must be estimated. Heteroscedasticity is tested with the H(h)-test (see Harvey 1989, p. 259f.) which compares the residual variance of the first h residuals with that of the last h, where h is the nearest integer to (T-d)/3 and d being the number of diffuse initial elements (d = 2 for this study), and the null hypothesis is that of homoscedasticity. The null is rejected, if \(H(h)\ge {F}_{1-\alpha /2}(h,h)\) or \(H(h)\le {F}_{\alpha /2}(h,h).\) Alternatively, if \(H\left(h\right)<1\), the null is rejected if \(1/H(h)\ge {F}_{1-\alpha /2}(h,h)\). Any estimated model passes the diagnostics if all tests indicate that the null cannot be rejected.

The empirical analysis is performed in R with help of the KFAS-package by Helske (2017).Footnote 7 The Kalman-Filter and Smoother recursions in this package are based on the univariate treatment algorithms given in Durbin & Koopman (2000, 2003, 2012). While the univariate treatment procedure has some computational advantages, it produces cross sectionally uncorrelated innovations. Hence, it is appropriate to apply the univariate tests mentioned above to each univariate standardized innovation series (ibid. 2012, p. 188), rather than applying their multivariate analogues to the multivariate vector of standardized innovations.

4 Assessing the impact of the euro adoption

4.1 An instruction for the empirical analysis with the example of Austria

Step 1: Choosing the control group

The first step in building a model is to select an appropriate control group. The selection procedure includes visual analysis and KPSS tests to check for cointegration with the respective EMU member across the whole sample. The cointegration tests are evaluated together with the criteria of small relative variance and economic similarity in terms of a close GDP per capita level in the pre-1999 period. At the outset of the analysis, the full sample period is used, but it may be appropriate to shorten it due to convergence processes and economic turbulence in the mid-1970s or early 1980s.

KPSS tests for the period 1970 to 1998 indicate that the control group would consist of Morocco, the UK and Tunisia. But Morocco and Tunisia are not at all comparable to Austria, since GDP per capita in both countries is about 20 times smaller than in Austria. Table 2 shows the results of the KPSS tests for the period 1976 to 1998 along with the other measures of the control group criteria. For this period, the KPPS tests show that the USA, Denmark, Israel, and the UK should form the control group. According to the cointegration, low relative variance, and economic similarity criteria, this is a much more plausible control group.

Table 2 Ordered KPSS tests for cointegration for Austria

Visual analysis of the mean-adjusted series of Austria and its potential control group for the period 1970 to 2017 confirms these results. Figure 1 shows the GDP per capita series of Austria and its potential control countries and their mean-adjusted differences from Austria for this period. It can be clearly seen that Austria converges economically to the potential control countries until 1976, but then cointegrates for the entire period before the intervention. Taking the results of the KPSS tests and the visual analysis together, the USA, Denmark, Israel and the UK provide an adequate control group.

Fig. 1
figure 1

Austria and control series (left) and differences of Austria and its controls (right). The series are adjusted for the mean prior to 1999

KPSS tests for trend and level stationarity are useful to get an idea of how to model the trend. Applied to the first difference of a series, the KPSS test for level stationarity indicates whether the trend is deterministic or stochastic. Applied to the original series, the KPSS test for trend stationarity indicates whether the slope is deterministic or stochastic. Table 3 shows the results for Austria and the selected control countries for the period 1976 to 1998. Only for Israel do the tests indicate that the slope is stochastic, but from the visual analysis (see Fig. 1) there is only a vague case for a stochastic slope. Finally, the picture that emerges from the tests and the visual analysis is that the trend is best modelled as a random walk with drift.

Table 3 KPSS tests for level and trend stationarity for Austria and its controls

Step 2: Building a naïve model

Once we have selected suitable control countries, we can set up the model for the period 1976 to 2019. As a starting point, we create a "naïve" model as in (4), ignoring possible common and country-specific structural breaks as well as country-specific deviations from the common trend. The goal of this naïve model is to get an overview of the model prediction and check if there are any structural breaks or influential observations. Since we do not know how to model the intervention, we estimate a BGMTV model. The resulting visual output of this model is shown in Fig. 2a. Now, the graph of the difference between Austria and its BGMTV synthetic control (shown in Fig. 2c) together with the table of differences (shown in Fig. 2d) gives an indication of how to model the intervention. According to the graph and table of differences, the periods 2001–2005 and 2015–2019 can be modelled with a step variable while the remaining time points need to be modelled with a pulse dummy. Modelling the intervention in this way and estimating the model with the full series of Austria yields the visual output shown in Fig. 2b. The synthetic control in Fig. 2a and b is constructed using (11). The model diagnostics can be found in Table 6 in Sect. 4.3 in the “Austria naïve” row.

Fig. 2
figure 2

Visual output of a BGMTV-model a, visual output of the full sample estimation b, difference of Austria and its BGMTV-model estimate c and the table of differences between Austria and its estimated BGMTV-model estimate d. Outliers or structural breaks are not captured in the model

Step 3: Checking the model

Checking the model includes testing for heteroskedasticity, normality and autocorrelation with standardized GLS residuals as well as performing an outlier analysis and a visual analysis of the mean-adjusted series for the target and control countries. The obtained estimates of the naïve model are close to the observed differences of Austria to its estimated synthetic control and the heteroskedasticity-test, normality-test and Q(5,4)- and Q(10,9)-test are satisfactory. However, the estimation of hypothetical Austria with the naïve model has some weaknesses. First, the estimated trend more or less ignores the large downshift of the 2008–2009 financial crisis and second, there are some individual trend deviations of Denmark in 1984–1988 and 2009–2019, of the USA in 1982 and of Israel in 2001–2007. In particular, the crisis in Israel around 2001, which recovers by 2007, and the much sharper downturn of Denmark in 2009 could affect the estimate of the synthetic Austria. Therefore, based on the visual analysis, a step variable 2009–2019 is added to the observation equation of Denmark and two step variables, 2001–2007 and 2002–2006 are added to the observation equation of Israel.

As mentioned in Sect. 3.3, adding outliers and common and country-specific structural breaks affects the estimation of the common trend as well as the weights and \(\widehat{{\varvec{\beta}}}\). Thus, before getting an idea of how to model the intervention based on the difference of the target series to its BGTMV-synthetic control, a preliminary outlier analysis is needed to ensure that all relevant outliers are captured in the model and that the model is well specified. Since we do not know how to model the intervention yet, the preceding outlier analysis is performed with BGTMV-models. In this process, the outlier analysis is performed repeatedly: If one or more plausible outliers are detected, corresponding step or pulse dummy variables are added to the respective observation or trend equation; then the model is estimated; then the model is checked again for plausible outliers; then plausible outliers are added to the model; then the model is estimated again, etc.

Applying this procedure after adding the aforementioned step variables for Israel and Denmark, the standardized smoothed state residuals reveal a consecutive structural break in 2008–2009, which is captured in the model by adding two pulse dummy variables to the trend equation. The standardized irregulars of the several series (stepwise) show outliers for Denmark in 1984, 1985, 1986 and 1987; for Israel in 1976 and 1989; for the USA in 1982 and 1983, for UK in 1980, 1981, 1982, 1983, 1988, 1989 and 1990; and for Austria in 1990, 1991 and 1992, that can be confirmed by the visual analysis. Capturing these outliers by adding appropriate pulse dummies to the respective observation equations and estimating the model again reveals that the parameters for 1991 and 1992 are almost equal and can therefore be combined into a single step variable, 1991–1992. After adding these outliers and structural breaks there are still some weak outliers in standardized irregulars of the USA in the mid-1980s, but adding these to the model can only hardly be justified by the visual analysis, so they are not added.

Step 4: Modelling the intervention and estimating the full model

Finally, all relevant outliers are captured in the model, and we can use the difference of Austria to its BGMTV-synthetic control to get an idea of how to model the euro adoption intervention. The resulting visual output of this model is shown in Fig. 3a. Figure 3c shows the graphical representation of Austria's difference from its synthetic control, and Fig. 3d shows the table of this difference. Figure 3c and d show that the intervention can be modelled with pulse dummies for 2005, 2006, 2007, 2008, 2011, 2012, 2013, 2014 and step dummies for 1999–2000, 2001–2002, 2003–2004, 2009–2010, 2015–2019. Finally, Fig. 3b shows the visual output of the full model containing all relevant dummies to capture common structural breaks, country-specific structural breaks and deviations from the common trend, as well as the intervention modelling. The variables used to construct this full model are listed in Table 4. The model diagnostics for this full model for Austria are reported in Table 5 in Sect. 4.2, along with the model diagnostics for the models of the other CEA8. Detailed information on the estimation procedure as given in Tables 2 and 3 and Fig. 3 is provided in an Online Appendix for all the remaining CEA8 countries as well as for the Euro Area aggregate (EA19) and for a control model for Greece in which its date of joining EMU is set to 2001. Details on the estimated parameters and their significances are also provided in the Online Appendix.

Fig. 3
figure 3

Visual output of a BGMTV-model a, visual output of the full sample estimation b, difference of Austria and its BGMTV-model estimate c and the table of differences between Austria and its estimated BGMTV-model estimate d. Outliers and structural breaks are captured in the model

Table 4 Pulse and step dummies in the full model for Austria
Table 5 Diagnostics tests and estimated weights for the full models of the CEA8, the EA19 aggregate and the Greek (2001) model

4.2 Evaluating the euro net effect

The procedure explained in the previous subsection is applied to each model for the CEA8 as well as for the EA19 aggregate and a control for Greece adopting the euro in 2001. The resulting visual output for each of the full models for CEA8 are shown in Fig. 4a. Figure 4b shows the visual results for the EA19 aggregate and Greece (2001).Footnote 8 The outputs show the original target series, their synthetic control according to (10), the common trend with respect to the target series, i.e., \({\widehat{\mu }}_{t}+{\widehat{\overline{\mu }}}_{1}\), and the mean-adjusted control series. Additionally, the estimated weights computed by (9) are given in the legend of each visual output. In each of the models, the common trend is modelled as a random walk with drift.Footnote 9 Before analysing these visual outputs, some notes on their generation and analysis are provided.

Fig. 4
figure 4figure 4

a Visual outputs of the full models for the CEA8 countries. b Visual output of the full model for the EA19 aggregate and the Greek (2001) model.

The first note concerns the choice of control group and negative weights in the estimation of full models. If the weight of a control country in a full model is negative, it is excluded from the control group. This applies to Denmark in the model for Germany and New Zealand in the model for Greece. In Sect. 4.3, robustness checks of the models are performed by estimating so-called "naïve" and "structural" models. In naïve models, only the euro adoption intervention is modelled, so that no structural breaks or outliers are captured in the model. In structural models, all common and country-specific structural breaks are added but no outliers in standardized irregulars. If the weight of a control country is negative in the full model, it is also dropped from the control group of the naïve and structural models, so that the robustness checks are performed with the same control group as in the full models. In some naïve and some structural models, the weights of the control countries are negative but become positive in the full model (compare Tables 5 and 7).

Second, country-specific structural breaks for Denmark and Israel are added to their observation equations in each model in which they are chosen as a control country. For Israel, two step variables are added: 2001–2007 and 2002–2007, and for Denmark one step variable is added: 2009–2019. This has the effect of pushing Israel to the common trend in 2001–2007 and Denmark in 2009–2019. Thus, Israel and Denmark serve only hardly as controls for these periods, but provide good control in the remaining periods. The addition of these level variables is closely linked to the assumption that the level changes in Israel and Denmark in the respective periods are the result of country-specific shocks, which should not affect the synthetic control estimate.

Third, country-specific structural breaks are added to the observation equations for France: 1996–2019, Germany: 1997–2019, Italy: 1996–2019, and the Netherlands: 1997–2019. These structural breaks are added because these countries leave the common growth path at these points in time. If these step variables were not added, the euro effect would be extended to include the level change, which would lead to an overestimation or underestimation of the euro effect by the parameter value of the step variables. The effects of these country-specific structural breaks can be seen very clearly in the different levels of the common trend and the estimated synthetic controls after 1996/1997. The fact that these EMU members leave the common trend just before euro adoption may be consistent with the euro anticipation effects examined by some of the AGDH-SCM studies discussed in Sect. 2.2. The argument is that EMU countries (citizens, politicians, banks, etc.) might change their economic actions in line with their expectations of the euro effect even before the actual introduction of the euro in 1999. However, it is at least equally plausible that these structural changes can be explained by a lagged effect of the fall of the Soviet Union, especially for Germany, but also for France and Italy, where the structural breaks are negative. Since we do not know whether these structural breaks are already caused by the euro introduction, step variables are added to separate these effects from the euro adoption. The robustness model for Greece (2001) shows that Greece is leaving the common trend in 1998, so a step dummy is added to this model as well. Apart from this difference, its synthetic estimate is very close to that of the model in which Greece is joining in 1999. On the one hand, the fact that Greece leaves the common trend in 1998 speaks for its “real” entry in 1999, so that the dating of the intervention is justified by meeting of the criteria for joining EMU. On the other hand, we do not know whether this is caused by the euro adoption, as in the case of the other countries discussed above. In this context, Greek development after 1999/2001 was much more dynamic than that of the other CEA8 countries, and the only suitable control countries found are Mexico and Brazil, which are much less developed (Both have less than half of Greek GDP per capita). Thus, both models for Greece are subject to some uncertainty regarding a suitable control group and an individual Greek growth path.

Fourth, the common structural breaks modelled in (8) are worth mentioning in the different models. Dummy variables to account for the structural break caused by the financial crisis in 2008 and/or 2009Footnote 10 are added to all models except the model for France. Moreover, there are some differences between the target and control series around 1990 in all models except the models for Greece and the Netherlands, and appropriate step and step dummies are added to the observation equation for the target to account for these differences. In the models of Belgium, France, Germany, Italy and Spain, outliers also indicate the addition of a common structural break in 1990, which is also captured with a pulse dummy in (8). Further structural breaks are added around 1980 to the models of Belgium, Germany, Greece, Netherlands and Spain. In general, most control and target series contain only a few outliers. Exceptions are the United Kingdom in the model for Austria, Sweden in the model for France, and Denmark in the model for the Netherlands, where more than five outliers are included in the corresponding observation equations. The interest in the parameter estimates is actually only secondary. The main focus is on visually examining the difference between EMU countries and their estimated synthetic controls, which provide information on gains and losses resulting from the euro adoption.Footnote 11

The fifth and final note relates to model diagnostics. Table 5 provides information on the Q(5,4), the Q(10,9), the JB, the H(h) statistics and their p-values for the target observation equation (since this is the series of interest), as well as the estimated weights of the control countries, which are also given in the legends of the various graphs in Fig. 4. In general, the model diagnostics are acceptable, but there is some evidence of autocorrelation in the models of France, Germany, Greece, and the Netherlands, as the Q(5,4) test rejects the null value at the 5% significance level. However, the Q(10,9) test accepts the null in the model of the Netherlands at the 10% level, and it accepts the null at the 5% level in the model of France and Greece. One could account for autocorrelation by modelling \({{\varvec{\varepsilon}}}_{t}\) as a stationary multivariate ARMA-process, but then many more parameters must be estimated. Since the trend modelling splits nonstationary and stationary components, the effect of autocorrelation is likely to be small, and it is preferable to tolerate some degree of autocorrelation (HT 2021, p. 7). This is confirmed in the robustness section (see Sect. 4.3), where not accounting for outliers in many models leads to a significant increase in the Q(p,l) statistic, while the estimate does not change much. All in all, the model diagnostics provide confidence in the estimates.

Keeping the five notes in mind, the analysis of the visual results in Fig. 4 shows two clear losers from the euro: France and Italy. While they are only marginally affected until 2001, they gradually lose out until 2019. Italy and France show a similar negative trend until 2010, but then Italy is hit much harder by the euro crisis and significantly increases the gap with its synthetic estimate, while France's gap increases only slightly. The synthetic estimates for Austria and Belgium are close to the original series. Due to the euro crisis, they lose slightly compared with their synthetic controls. The Netherlands records some small gains but is close to its synthetic estimate until 2009 but has been losing steadily since 2010. Germany loses until 2010, but is no longer affected thereafter. Spain and Greece gain until 2010 but lose afterwards. Spain finally returns to the common trend after the euro crisis. Interestingly, the visual output for the EA19 aggregate gives a similar picture as when considering all the individual results in their entirety.

The results of the visual analysis are quantified in Table 6, which provides information on the absolute and percentage differences between the original EMU series and their synthetic controls at the end of the pre-financial crisis period (2007), at the end of the financial crisis (2009), at the end of the euro crisis (2013), and at the end of the sample in 2019. Italy's remarkable loss in 2019 is $9341, a difference of 26.23% from its synthetic control that would not have adopted the euro. France's shortfall is also remarkable, as its loss in 2019 is 12.07%, equivalent to about $5350. Greece's large gains in 2007, as well as its losses in 2013, are also notable. For Austria, Belgium, and the Netherlands, neither the losses nor the gains are very large in any period.

Table 6 Absolute and percentage gaps of CEA8 and EA19 aggregate and the Greek (2001) model at the end of selected time periods

Overall, the development of the gaps between the original series and their synthetic estimates in the CEA8 is individual, but there are some structural movements in the response to the 2008–2009 financial crisis and to the 2010–2013 euro crisis. During the financial crisis, Austria, Belgium, and Spain benefit from the euro by losing less than their synthetic controls, and Germany benefits by catching up with its synthetic control. During the euro crisis, all CEA8 start to lose or continue to lose, leading to a negative gap (Austria, Belgium, Netherlands, Spain and Greece) or a widening of the negative gap (France, Italy). The only country that succeeds relative to its synthetic control during the euro crisis is Germany, which catches up and overtakes in 2011 and 2012, but then loses a little and approaches the level of its synthetic control by 2018. In the period after the euro crisis, the Netherlands, Spain and Greece start to catch up with their synthetic controls, while the size of the gap remains more or less unchanged for the other CEA8. In summary, with the exception of France and Italy, the CEA8 seem to perform slightly better during the financial crisis, while sliding into a structural crisis in the period thereafter.

Finally, it is worth noting that the synthetic controls obtained by applying HT-SCM are close to some of the AGDH-SCM estimates from the studies presented in Table 1. The synthetic control estimates for Austria, France, Germany, and the Netherlands are close to those of Puzzello & Gomis-Porqueras (2018); the estimates of Lin & Chen (2017) are broadly consistent with those for Austria, France, Germany, and Italy; and the results of Fernandez & Garcia Perea (2015) are close to those for Austria, Spain, and Greece.

4.3 Robustness section

In the following section, I will show that the estimated synthetic controls presented in Fig. 4 are robust to changes in the modelling approach. Therefore, the "full" models presented in Sect. 4.2 are compared to the models referred to below as "naïve" and "structural." A naïve model, as in step 2 of Sect. 4.1, refers to a model in which all temporary country-specific outliers as well as common and country-specific structural breaks are excluded from the model and only the euro adoption is modelled. A structural model refers to a model that includes all country-specific and common structural breaks as well as modelling the euro adoption, but excludes temporary country-specific effects.

Figure 5 shows the estimated differences between the target series and their synthetic controls in the full model, naïve model, and structural model together in one graph for each of the CEA8, the EA19 aggregate, and the Greek control model. Table 7 provides information on model diagnostics based on the JB statistics, the Q(5,4) and Q(10,9) statistics, and the H(h) statistics, along with the estimated weights of the naïve and structural models.

Fig. 5
figure 5

Gaps of the several CEA8 countries, the EA19 aggregate and Greece (2001) estimated by the full model (black lines) the structural models (blue lines) and the naïve model (red lines). (Color figure online)

Considering at first only Fig. 5, I find that the estimated gaps of the structural model are close to the estimated gaps of the full model. Exceptions are the estimates for Belgium and the EA19 aggregate, for which the structural model predicts higher losses. In many cases, the estimates of the naïve models differ somewhat more from the estimates of the full models. According to the naïve model, Austria, Belgium and the Netherlands would benefit by up to 3 to 5 percentage points more in the period before the financial crisis, while Germany and Belgium would lose up to 3 to 7 percentage points in the period during and after the financial crisis, compared with the gap estimated by the full model. For France and Italy, the gap estimated with the naïve model is larger by 2 to 5 percentage points for the entire period after the introduction of the euro. Despite these discrepancies, the naïve, structural and full models create synthetic controls that move in parallel, so they can be said to follow the same growth path. Note that the difference between the naïve and structural/complete models for France reflects the step variable that is added because France leaves the common trend even before the introduction of the euro. If this variable were not added, the euro net effect would be overestimated by the corresponding parameter, which is −0.0268 and highly significant. The same applies to the model for Italy (where the parameter is −0.0236 and highly significant), to the model for the Netherlands in the period before the financial crisis (where the parameter is 0.0202 and highly significant) and to the model for Germany (where the parameter is −0.0331 and highly significant) in the period after 2008.

Table 7 shows that several of the estimated naive and structural models fail some or all of the model diagnostics at the 5% significance level and that the estimated weights are negative in a number of models (especially for Denmark). Moreover, the estimated weights differ substantially between structural, naïve, and full models (compare the weights in Table 5 with those in Table 7). However, despite the failed diagnostic tests and the different estimated weights, which may even be negative, the three models yield similar synthetic controls. This leads to the conclusion that including outliers and structural breaks in the full model primarily eliminates misspecification, but does not have as large an impact on the results. In addition, adding country-specific outliers and structural breaks eliminates extrapolation in the estimation of weights to some extent, which is reasonable because influential observations bias the variance and the estimation of weights depends on \({\boldsymbol{\Sigma }}_{\varepsilon }\).

From Fig. 5 and Table 7, I conclude that the full model estimates are mostly robust to (a) modelling outliers and thus the number of regression parameters included in the model, (b) changes in estimated weights, and (c) some degree of misspecification indicated by the diagnostic tests.

Table 7 Diagnostics tests and estimated weights of the controls of the naïve and structural models of the CEA8, the EA19 aggregate and the Greek (2001) model

4.4 Explaining the gap

The estimation of synthetic controls for the CEA8 opens the door for further research on the economic impact of a euro adoption. We are now able to link the theoretical and empirical state of the literature, briefly described in Sect. 2.1, to the gaps between the CEA8 and their synthetic controls. Moreover, we are able to examine the channels of the financial and euro crises in more detail. In this subsection, I derive hypotheses about the effects of the main drivers of gains and losses due to EMU membership and test them using fixed-effects regressions. The aim of this section is to relate the estimated synthetic controls to some of the main drivers of gains and losses described in the literature. Since much of the literature on common monetary unions is concerned with the loss of individual monetary policy in response to (asymmetric) shocks, the crisis channel is of particular interest. Therefore, the regression analysis is divided into two parts. First, benchmark results are constructed for the entire sample. Second, the sample is split into four periods: the pre-crises period, the financial crisis period, the euro crisis period, and the post-crises period. While this drastically limits the degrees of freedom and may not be appropriate for drawing conclusions from these results, it allows for a more detailed examination of the crisis channel. In addition to testing the hypotheses stated in the OCA/EMU literature, the regression results also provide validation of the accuracy of the estimated synthetic controls.

4.4.1 Deriving hypotheses from the literature on common currency areas

The loss of an individual monetary policy and, in particular, the inability to respond to shocks with currency devaluations or revaluations has been seen as major costs of joining a common currency since the seminal work of Mundell (1961), McKinnon (1963), and Kenen (1969). As a result, much of the research on joining currency unions is concerned with finding alternative regulatory mechanisms. In this context, Mundell (1961) focuses on labour mobility and price and wage flexibility. In short, the rationale behind these criteria is as follows: An asymmetric shock can lead to demand and supply shocks, so that some countries face excess demand and others face excess supply. If workers who become unemployed in one country now simply move across the border where their labour is demanded, this will effectively dampen the shock. Moreover, currency devaluations and revaluations are indirect changes in the domestic price level. If wages and prices could be adjusted directly, for example, by structurally hiring or firing workers or by imposing new wages or prices on products, currency devaluations and revaluations become irrelevant. Unfortunately, the actual degree of labour mobility is difficult to observe, but it can be approximated by net migration. The more open an economy is to immigration, the higher the degree of labour mobility will be and the lower the cost of joining EMU. The data on migration come from Eurostat. Net migration is defined as the sum of immigrants minus emigrants as a percentage of the population. One indicator of wage flexibility is the employment protection index provided by the OECD database. The higher an index value, the lower the wage flexibility, so the expected sign for the employment protection index is negative.

McKinnon (1963) emphasizes that a higher degree of openness lowers the cost of joining a common monetary union because the domestic price level depends to a greater extent on the international prices of tradable goods. This makes currency devaluations and revaluations less effective and the loss of the ability to change the value of the currency less costly. In addition, trade with currency partners becomes cheaper because many trade costs are eliminated, so more open economies benefit more from joining a currency union if they trade more with currency partners. Consequently, small and open economies such as Austria, Belgium, and the Netherlands should benefit more from joining EMU, but openness should be beneficial for all EMU members. Openness is measured via trade openness as the sum of imports and exports as a percentage of GDP. Data on imports and exports stem from the World Bank's World Development Indicators database.

In the newer stream of the literature on EMU, Frankel & Rose (1997, 1998) argue for the endogeneity of a monetary union and conclude that the synchronization of business cycles is an important criterion for the success of a monetary union. Convergence effects are closely related to the endogeneity of a monetary union. According to Frankel & Rose (ibid.), convergence of business cycles means that they become more synchronous over time. Dellas & Tavlas (2009, p. 1128) elaborate that real convergence processes are also likely for economies joining a monetary union whose members are more economically developed. According to the authors, the relatively underdeveloped economy would benefit by adopting relatively high expected returns on investment and low real interest rates. These, in turn, lead to more optimistic income expectations and excess domestic demand. To test the real convergence hypothesis, 1998 GDP per capita is included in the regression. The convergence hypothesis states that the lower GDP per capita was before euro adoption, the higher the gains from joining EMU. Therefore, the expected sign of the parameter for GDP per capita in 1998 is negative.

The synchronization of business cycles is often measured in the recent empirical literature using the approach of Cerqueira & Martins (2009) and Cerqueira (2013). Their approach allows for the computation of the correlation of cycles at a given point in time rather than only in a given time range. Thus, it is possible to include temporal variability in the measure of co-movement, which is limited by the conventional correlation measure. The calculation for the degree of co-movement is as follows

$$\rho_{ijt} = 1 - \frac{1}{2}\left( {\frac{{\left( {d_{jt} - \overline{d}_{j} } \right)}}{{\sqrt {\frac{1}{T}\mathop \sum \nolimits_{t = 1}^{T} \left( {d_{jt} - \overline{d}_{j} } \right)^{2} } }} - \frac{{\left( {d_{it} - \overline{d}_{i} } \right)}}{{\sqrt {\frac{1}{T}\mathop \sum \nolimits_{t = 1}^{T} \left( {d_{it} - \overline{d}_{i} } \right)^{2} } }}} \right)^{2} ,\;t = 1, \ldots ,T$$
(14)
$$\rho_{ijt}^{nb} = \frac{1}{2}*ln\left( {\frac{{1 + \frac{{\rho_{ijt} }}{2T - 3}}}{{1 - \rho_{ijt} }}} \right)$$
(15)
$$\rho_{ijt}^{cor} = {\text{tanh}}\left( {\rho_{ijt}^{nb} } \right)$$
(16)

Cerqueira (2013) shows that \({\rho }_{ijt}\) is bounded to \([3-2T,1]\), while \({\rho }_{ijt}^{nb}\) is (un)bounded to (\(-\infty ,\infty )\) and so \({\rho }_{ijt}^{cor}\) is bounded to \([-\mathrm{1,1}]\). While all of these measures are indicators of the degree of co-movement of two series, only \({\rho }_{ijt}^{cor}\) allows the usual interpretation of correlation, and so \({\rho }_{ijt}^{cor}\) is chosen to measure business cycle correlation.Footnote 12 To calculate the business cycle, the HP filter is applied to the GDP per capita series (transformed into logarithms, sample range from 1970 to 2019) with λ = 100, as suggested for annual data. The synchronization of the business cycles is evaluated as the mean of the correlations between the CEA8 and the 12 early adopters (EA12), so that the synchronization of the business cycle for CEA8 i at time t is measured as follows

$$\overline{\rho }_{it}^{cor} = \frac{1}{M - 1}\mathop \sum \limits_{j = 1}^{N} \rho_{ijt}^{cor} ,i \ne j, i = 1, \ldots ,N, j = 1, \ldots ,M, t = 1999, \ldots ,2019$$
(17)

where \(N=8\) and \(M=12\). If the hypothesis of Frankel & Rose (1997, 1998) holds, the sign of \({\overline{\rho }}_{it}^{cor}\) should be positive.

Finally, two control variables – public debt and competitiveness – are added to the regressions. Public debt is added to control for the effect of fiscal stance and to observe its effect during the financial crisis and, in particular, the euro crisis, for which public debt is often considered the main driver. In general, a country that does not share its currency with other countries can increase its public debt to dampen the effects of a shock. The country can then create inflation, i.e., print money to repay its debt. This is not possible if a country is a member of a monetary union with a single central bank. Then rising public debt can increase investor distrust in the country in question, and interest rates on government debt rise. Instead of cushioning the shock, this can make it worse (De Grauwe 2020, pp. 10ff.). This consideration is the basis for Kenen's (1969) criterion of fiscal integration. Asymmetric shocks could be cushioned by fiscal transfers rather than by an increase in public debt. However, since EMU is not a fiscal union, fiscal stance becomes important to avoid sliding into a sovereign debt crisis. Public debt data are provided by Eurostat and measured as a percentage of GDP. The expected sign of government debt is negative, and the impact of government debt should be greater in the period of the euro crisis.

Currency revaluations and devaluations affect the competitiveness of an economy because they change the domestic price level relative to the foreign price level. In a monetary union, the domestic price level is determined by the aggregate price level of the monetary union. An economy joining a monetary union must compete in international markets with the price level set by the monetary union. Once an economy can no longer compete with this price level, for example because it is hit by a shock and unemployment rises, it has no way to change its competitiveness through currency devaluation, and sharing a currency becomes costly. This point is made by Sinn (2014). As a result, a loss of competitiveness is expected to increase losses from EMU membership. To estimate the impact of competitiveness, real unit labour costs from the AMECO database are included in the regression. The base year for real unit labour costs is 1998, so real unit labour costs in 1998 are 100. The higher the real unit labour cost, the lower the competitiveness, so the expected sign of the real unit labour cost is negative.

4.4.2 Regression analysis

To test the derived hypotheses, several panel regressions are performed. The analysis mainly relates to two benchmark models. The first benchmark model covers the entire period after the introduction of the euro. In the second benchmark, the effect of several variables is split using cross terms with step dummies to extract the effect of the variables in different time periods. These dummies are defined to divide the period into a pre-crises period, a post-financial crisis period, a post-euro crisis period, and a post-crises period:

$$\begin{aligned} & precrises_{t} = \left\{ {\begin{array}{llll} {1,\quad if\ \ 1999 \le t \le 2007} \\ {0,\quad else} \\ \end{array} } \right. \\ & fcrisis_{t} = \left\{ {\begin{array}{llll} {1,\quad if\ \ \ 2008 \le t \le 2009} \\ {0,\quad else} \\ \end{array} } \right. \\ & ecrisis_{t} = \left\{ {\begin{array}{llll} {1,\quad if\ \ 2010 \le t \le 2013} \\ {0,\quad else} \\ \end{array} } \right. \\ & postcrises_{t} = \left\{ {\begin{array}{lll} {1,\quad if\ \ 2014 \le t \le 2019} \\ {0,\quad else} \\ \end{array} } \right. \end{aligned}$$
(18)

Table 8 provides information on the different variables used in the regression models. Column 1 informs about the variables in the regression analysis with their abbreviations in parentheses, column 2–8 inform about their scaling, some descriptive statistics, the number of NAs and the expected sign. As can be seen, net migration is the only variable with missing values. These missing values occur for France in 1999–2005, Belgium in 2008–2009, and Spain in 1999–2001. In addition to the variables listed in Table 8, a cross term of GDPpc98 × RULC98 is added to the regressions to control for the association that poorer economies are generally less competitive. Adding this cross term separates the effects of convergence (poorer economies gain more) and competitiveness (less competitive economies gain less). The explanatory variables are mean-centred to ensure that the estimated time fixed effects can be interpreted meaningfully. Mean-centring allows their parameters to be interpreted as the euro effect for an average EMU member at time t. Comparing the parameters of the time fixed effects with the estimated gap for the EA19 aggregate provides some cross-validation of the regression results and the estimate of the synthetic STM controls.

Table 8 Overview of the variables used for the regression analysis

The results of the regressions are reported in Table 9. The first two regressions are estimated for the entire sample after the introduction of the euro. Model (1) is the benchmark with all variables. Model (2) excludes net migration to control for the effect of excluding 12 observations. For model (3), the effects of business cycle correlation, competitiveness, trade openness, public debt, net migration and employment protection are partitioned into the pre-crisis, financial crisis, euro crisis and post-crises periods by multiplying them by the mentioned step dummies in (18). All three models include time fixed effects and the capturing of outliers (|std. residuals|> 2.75). These occur only in the residuals of Greece in 2004, 2006, 2007, and 2012 in models (1) and (2) and in 1999 in model (3). Moreover, the variance of the Greek residuals is also visibly higher. Given this, the models exhibit some difficulty in predicting Greece, suggesting that Greece's post-1999 performance is influenced, at least to some extent, not only by the euro adoption but also by an individual growth path.

Table 9 Regression results

A first analysis of models (1) and (2) shows that the main drivers of gains and losses are real convergence processes, trade openness, public debt, competitiveness and net migration. All of these factors are highly significant and have the correct sign. Business cycle synchronization and employment protection are not significant. The results of model (2) show that the exclusion of net migration does not change the sign, magnitude, and significance of the other variables, leading to the conclusion that the addition of net migration and the accompanying loss of 12 observations is not a trade-off. Moreover, \({\overline{R} }^{2}\) indicates that model (1) is slightly better and therefore the inclusion of net migration may be the better choice to explain the gains and losses.

When analysing model (3), it is important to keep in mind that splitting the effects of the variables across the pre-crisis, financial crisis, euro crisis and post-crises periods drastically reduces the degrees of freedom for estimating the respective parameters. Since there are 123 degrees of freedom to estimate 8 variables in model (1), extracting the effects for different periods reduces the degrees of freedom to 108 to estimate 26 variables in model (3). Moreover, the observations for estimating the parameters in different periods reduce to 8*9 for the pre-crises period, 8*2 for the financial crisis period, 8*4 for the euro crisis period, and 8*6 for the post-crises period. Thus, the analysis of model (3) is not so much about significance, but about the signs of the parameters and the comparison of their magnitudes with model (1). Thus, a comparison of the results of model (3) with those of model (1), focusing mainly on the sign and magnitude of the parameters, provides some interesting information. The first interesting observation is that \({\overline{R} }^{2}\) is much higher than in model (3), even though the same variables are used. This suggests that considering different slopes in different sample periods not only provides interesting information about the variables, but also makes sense from an econometric point of view.

The other interesting information concerns the variables. Instead of being insignificant and close to zero in model (1) and (2), the business cycle correlation actually affects profits and losses, given the different slopes in the different periods. While its impact is positive in the pre-crises period, its slope becomes negative thereafter. A simple interpretation of this change in slope is that ECB policy was optimal in the pre-crises period but detrimental thereafter. The difficult task is to explain why the effect of ECB policies changes. As a possible explanation, I examine in Fig. 6 the cycle movements of the various EA12 countries and \({\overline{\rho }}_{it}^{cor}\) of the CEA8 from 1990 to 2019. In general, the cycles of the EA12 countries coincide very well until 2007. Some exceptions to the general co-movement are Germany in the early 1990s and around 2005, the Netherlands during the turn of the century, and Greece, whose cycle starts to diverge around 1998. The only negative correlation until 2007 is for Greece in 1999. Looking now at the period after 2007, the picture is different. In particular, the crisis causes the cycles to diverge and the correlation to become lower, which changes again after the euro crisis around 2016/2017. Therefore, in the pre-crises period, it was more or less easy for the ECB to pursue an optimal business cycle policy for all EMU members, as the business cycles were synchronous. However, during and after the crisis, the cycles were no longer so synchronous, and the ECB may have chosen its policy to dampen the shock for the heaviest "losers" of the crisis, namely Greece, Italy and Spain (and indeed quantitative easing, debt relief for Greece and low interest rates suggest that the ECB did so). In turn, ECB policy was more optimal for the less correlated EMU members, leaving out the more correlated countries. However, this effect needs further research to fully explain the negative sign since the financial crisis.

Fig. 6
figure 6

Business cycle correlation and mean correlation calculated for 1990–1998, 1999–2007, 2008–2009, 2010–2014 and 2015–2019 for the various CEA8 with the EA12 (left) and extracted business cycles of the EA12 (right)

While the real unit labour cost parameter is negative and significant in models (1) and (2), the impact of competitiveness seems to be a driving force especially during the euro crisis and in the post-crises period. As expected, its sign is negative at this time. During the financial crisis, on the other hand, lower competitiveness appears to be beneficial. However, the t-statistics of real unit labour costs are very small in each period. Net migration is an important factor in dampening the shock of the euro crisis, while the fiscal stance is important in the pre-crisis and post-crises periods. As expected, the negative impact of fiscal instability is much larger during and after the euro crisis than in the pre-crises period. Trade openness is beneficial until the financial crisis, but it does not help dampen the shock of the euro crisis. On the contrary, a more open economy further increases the losses from EMU membership during and after the euro crisis. Employment protection has a positive effect in the pre-crises period, but as expected, a rigid labour market increases losses during and after the crisis.

Overall, the empirical results of the three regressions paint a picture that is largely consistent with the hypotheses, and examination of the crisis channel shows that some of the main drivers do indeed exert their effects during a crisis. Lastly, Fig. 7 shows the plot of the estimated gap for the EA19 aggregate together with the time fixed effects of the three regression models. The estimates for an average EMU member (the EA19 aggregate is a weighted combination of the 19 EMU members and can therefore be considered an EMU average) are very similar despite different approaches, further cross-validating the estimation results of the HT-SCM and the regression. Only in the 2004–2007 period are the time fixed effects of models (1) and (2) significantly lower than those of model (3) and those of the estimated EA19 gap of the full model, while the time effects of model (3) are significantly higher in the 2008–2010 period. When capturing outliers in models (1)-(3), the time fixed effects are even closer to the gap, which is due to the fact that capturing outliers isolates Greece's observations from the regression at the corresponding points in time. Therefore, when comparing different models, it may be more appropriate not to include outliers in the regressions.

Fig. 7
figure 7

Estimated EA19 gap of the full HT-SCM model and time fixed effects of the regression models when regression outliers are captured (left) and when regression outliers are not captured (right). The red, blue and green line visualize the time fixed effects of regression model (1), (2), and (3), respectively

5 Conclusion

This study assesses the impact of euro adoption on GDP per capita for eight early adopters and the EA19 aggregate by applying a new SCM based on STM and cointegration. The results show that France and Italy are clear losers from the euro. Italy in particular suffers drastic losses. While Greece gains a lot until the financial crisis, its losses thereafter are also remarkable. Spain also gains until 2009, but loses thereafter. After the euro crisis, Spain increasingly catches up with its synthetic estimate. Germany loses initially, but is able to catch up during the financial and euro crises. The Netherlands, Austria and Belgium are hardly affected with slight gains or losses until 2009, but then lose due to the euro crisis. The results clearly show that, with the exception of Germany, all CEA8 slip into the crisis in 2010.

Identifying an ex-post benchmark for the net euro effect opens the possibility to test hypotheses from the OCA literature. Running panel regressions shows that real convergence, trade openness, and net migration as a proxy for labour mobility are the main drivers of gains from EMU membership, while fiscal instability and low levels of competitiveness are the main drivers of losses. Thus, the regression confirms the thoughts and theories of the OCA literature.

Both the OCA literature and the estimated synthetic controls suggest that the crisis channel plays an important role. While the OCA literature concludes that the main costs stem from the inability to respond to asymmetric shocks with independent central bank policy, the estimated gaps between the CEA8 and their synthetic controls show that the euro crisis in particular leads to a larger negative gap. Examining the effects of the main drivers in the pre-crisis, financial crisis, euro crisis and post-crises periods, I find that the effects of the main drivers unfold during the crises. Taking into account different slopes for these periods, I find that business cycle synchronization has a negative impact during and after the crises, while it is beneficial in the pre-crises period. In contrast, its effect turns out to be insignificant when different slopes are not taken into account. This suggests that ECB policy was not optimal for the more correlated EMU members and focused mainly on the less correlated and more suffering economies of Greece, Spain and Italy. Employment protection, as an indicator of wage and labour market rigidity, is negative during and after the crises, thus acting mainly through the crisis channel. Net migration has its positive effect during the euro crisis. Openness is beneficial in the pre-crises period, but its effect is negative thereafter. Finally, fiscal instability is found to be the main driver of losses during and after the euro crisis, confirming that the euro crisis is a sovereign debt crisis. Lower competitiveness may also have a negative effect in the period after the financial crisis.

In summary, the results of this study draw a rather pessimistic picture of the economic impact of a euro adoption, and both the STM synthetic controls and the time fixed effects of the regression models suggest that an average EMU economy has to expect losses from an adoption. However, both measures also suggest that the euro crisis was responsible for a large share of these losses. Thus, understanding the crisis channel is important for future research and for policymakers. The finding that fiscal instability is a major problem of EMU opens the debate on whether EMU members should integrate further into a fiscal union and become more and more the "United States of Europe" or whether countries should be allowed to leave EMU and return stronger after regaining their competitiveness and fiscal stance. Aside from the question of how to adjust the EMU system in detail, the results of this study show that it must be adjusted to be successful in the future.