1 Introduction

The high costs of financial crises for the public sector as well as private investors have led to a large number of empirical studies that aim to detect crises in a timely manner, so that a crisis can be avoided or its impact mitigated. Typically, Early Warning Systems (EWSs) are employed for this purpose. These are quantitative models that predict extreme events based on statistical information and mechanisms from the past. For overviews see Kaminsky et al. (1998) or Abiad (2003).Footnote 1

EWSs for financial crises have been criticized for several reasons, one of which is that they are only useful for detecting past crises (Frankel and Saravelos 2010). An important reason is that the data used by forecasters to construct the EWS are not available in a timely manner. In addition, indicators are selected with the benefit of hindsight.Footnote 2 EWSs are typically tested out-of-sample using the most recent data, known as current-vintage data. When an EWS detects crises using current-vintage data but fails to detect crises in the out-of-sample prediction period, it provides a false sense of security.

We focus on a particular type of financial crisis, the currency crisis, which is defined as a large, sudden depreciation or devaluation of the exchange rate, or an episode with high pressure on the exchange rate that may result in large losses of international reserves and/or a hike in domestic interest rates to defend the exchange rate (Berg et al. 2005). Currency crises not only occur in countries with fixed exchange rate regimes, but also in countries with flexible exchange rates which, in principle, should be more resistant to currency crises. One would expect continuous market adjustment to limit the buildup of pressures leading to extreme currency overvaluation and subsequent large discrete currency declines as may occur under fixed exchange rate regimes. Pegged and intermediate exchange rate regimes are indeed associated with greater susceptibility to currency crises, particularly in developing and emerging market countries with more open capital accounts (Ghosh et al. 2010). However, many countries with floating exchange rates have experienced currency crises. A possible explanation is the fact that countries reporting their currencies as floating are often quite reluctant to allow their currencies to float due to so-called fear of floating behavior (Calvo and Reinhart 2000), and de facto follow a pegged exchange rate regime (Glick and Hutchison 2011).

In this paper, we investigate the performance of EWSs for currency crises with real-time data, which reflects the notion that data (indicators and forecasts) are subject to revisions in the course of time. We assess the sensitivity of EWS currency crisis predictions to the choice between two alternative data sets. Our first data set consists of current-vintage data, that is, information of the most recent vintage. The second data set contains real-time indicator forecasts. For this purpose we convert annual forecasts for period t, published at the beginning of period t, into quarterly forecasts; for details see Section 3 below.

We investigate two types of commonly used early warning systems for currency crises: the signal approach and the logit model. We apply each EWS to a panel of 15 countries, five from Latin America, five from Asia, and five from Central and Eastern Europe, in the period 1991–2017, employing data at a quarterly frequency. We estimate our EWSs on the in-sample period (1991Q1–2010Q4), using 10–15 economic and financial indicators that correspond to different generations of currency crises (Kaminsky 2006; see also e.g. Zhao et al. 2014, Section 2.1) and are typically employed in studies on EWSs for currency crises. Then we predict currency crises for the out-of-sample period (2011Q1–2017Q4) with the two data sets described above: current-vintage data and real-time indicator forecasts.

To review the performance of alternative EWSs in forecasting currency crises in emerging economies in the late 1990s, Berg et al. (2005) use only information that was available at the time. No information about actual outcomes is used in the forecasts. They use the internal IMF July 1999 forecasts and subsequent internal forecasts to compare the performance of alternative EWSs. Gunther and Moore (2003) compare the performance of an EWS for banking crises in the period 1996–1998 using first releases (the first official publication of the data after the period has ended) versus current-vintage data that were available in May 2000, and find that the EWS with current-vintage data performs better than the EWS with first releases. Reagle and Salvatore (2005) test the robustness of predicting the Asian 1997–1998 financial crisis without and with data revisions. They estimate a probit model for a cross-section of 54 emerging economies with three sets of data: the original, unrevised 1996 World Bank data, and the 1999 and 2004 updates of the 1996 World Bank data. They conclude that data revisions lead to significant changes in the model’s estimates, which presents a problem for researchers that should be recognized and addressed.

The importance of using early estimates in EWSs has been acknowledged in the literature, but real-time data have not been implemented on a widespread scale. Frankel and Saravelos (2012) comment that predictions issued in real-time would be impressive, but also especially difficult. Lo Duca and Peltonen (2013) remark that real-time data sets that contain information on the revisions of data after the first publication do not exist yet for several countries in their sample of developed and emerging economies. Data availability issues seem to be the major reason why implementation has not been widespread; early estimates for emerging economies are not publicly available.

Alessi and Detken (2011) use quasi real-time data. Although it sounds similar to real-time data, the authors actually mean current-vintage data. In their words:

The caveat is that we use the most recent vintage of data and not a true real time data set with unrevised data. Nevertheless, we use conservative lags to proxy for standard publication lags and thus real time data availability. [...] The performance of real variables could possibly be worse in a true real time setting compared to a quasi real time setting, as real variables can be heavily revised (i.e., the quality of current-vintage data can be much better than the quality of real-time vintages). [page 524]

Lo Duca and Peltonen (2013) also use quasi real-time data, in their case to predict financial stress events. Holopainen and Sarlin (2016) use quasi real-time data to predict banking crises. Their contribution to the crises EWS literature is to assume that the forecaster uses all available accounting and market-based information with a lag.

Our approach differs from Berg et al. (2005) in several ways. First, we explicitly focus on the difference between indicator forecasts and current-vintage data when making out-of-sample predictions. Second, we use the consensus forecasts of Haver Analytics, whereas they use internal predictions from the IMF, which are not publicly available. Third, we consider a longer period. As a consequence, our sample includes more and different currency crises, both from the 1990s (e.g. Mexico 1994–1995, Russia 1998, Brazil 1999) and the next decade (e.g. Argentina 2001–2002, and the Global Financial Crisis that hit most emerging economies in 2008–2009). Including more crises makes our analysis more complete, because of the variety in the crises (Kaminsky 2006). In contrast to Gunther and Moore (2003) and Reagle and Salvatore (2005), we use forecasts of indicators to predict currency crises, whereas they focus on first releases of indicators to predict banking crises. Similar to Reagle and Salvatore (2000, 2005) we use panel data, but they estimate a probit model using cross-sections. Panel data have advantages over cross-sections and time series, in particular increased sample variation and the ability to accommodate unobserved heterogeneity through fixed effects.

Our results can be useful for policymakers as attention is drawn to the issue of data availability. Since current-vintage data is not available when predictions have to be made, a realistic out-of-sample evaluation of EWSs requires early estimates (i.e., forecasts) of the indicators. Our results show that crisis prediction results are worse when using early estimates.

The remainder of this paper is structured as follows. Section 2 describes the two types of EWSs we employ in this paper. After the description of the data in Section 3, we present and discuss the empirical results in Sections 4 and 5. Section 6 concludes.

2 Methodology

This section introduces two of the most widely used types of EWSs to predict currency crises, the signal approach and the discrete choice model.Footnote 3 The final subsection shows how we compare the out-of-sample currency crisis prediction performance using current-vintage data and using indicator forecasts.

2.1 Signal Approach

Eichengreen et al. (1995) and Frankel and Rose (1996) introduce event study graphs to analyze and predict currency crises. The method involves a graphical comparison of the performance of indicators in times of crisis versus their performance in tranquil periods. Kaminsky et al. (1998) extend this methodology to what is known as the signal approach. This approach consists of two stages. In the first stage, the indicators that are expected to play a role in the crisis, such as inflation, debt as a percentage of GDP and the current account, are selected. Typically, a visual inspection of the event study graph determines whether the indicator shows special, extraordinary behavior before a crisis. This helps to restrict the number of potential crisis indicators. In the second stage, a threshold is determined for each indicator by minimizing the probability of not signaling a crisis that occurred (type I error) and the probability of signaling a crisis that did not occur (type II error). If the variable exceeds a pre-established threshold, a crisis is signaled and the binary crisis variable is assigned the value 1; otherwise it is assigned the value 0. For each threshold we construct a contingency table as in Table 1. A represents the number of observations in which the model signals a crisis that actually took place (correct crisis signals); B corresponds to the number of observations in which the model signals a crisis that did not take place (false alarms, also known as type II errors); C is the number of observations in which the model does not signal a crisis that actually took place (missed crises, a.k.a. type I errors); and D is the number of observations in which the model does not signal a crisis that did not take place (correct non-crisis signals).

Table 1 Contingency table of crisis realizations and signals

The main advantages of the signal approach are that the method does not impose any parametric structure on the data, and that the method is more accessible and informative than tables of coefficient estimates. The main disadvantage is that the approach is intrinsically univariate as we analyze the individual contribution instead of the marginal contribution conditional on other variables (Frankel and Rose 1996).

The optimal threshold depends on the relative costs of the two error types. Bussière and Fratzscher (2006) consider type II errors less worrisome from a policy-maker’s perspective for two reasons. First, type II errors (false alarms) tend to be less costly from a welfare perspective than type I errors (missed crises). The costs of a false alarm are the costs of taking preventive actions, the risk of a self-fulfilling prophecy, and the loss of trust in the policy makers when false alarms become frequent. However, missing a crisis (which possibly could have been prevented, or the effect of which could have been lowered by pre-emptive policies) often has a higher welfare cost. For instance, the high costs of financial crises have manifested themselves in the form of large output contractions and rising unemployment and poverty rates in many affected emerging markets in the 1990s (Rewilak 2018). Second, false alarms may not always be due to the predictive failure of the model, but may simply reflect the fact that, although fundamentals indeed indicated vulnerabilities in the country’s economic situation, appropriate policy initiatives were taken to improve the resilience of the economy and prevent a crisis (Bussière and Fratzscher 2006).

There are several ways to determine the optimal threshold. We use the Total Misspecification Error and the Noise to Signal ratio.Footnote 4 The letters (A, B, C, D) used in the formulas below refer to the categories distinguished in Table 1.

2.1.1 Total Misspecification Error

The Total Misspecification Error (TME) is the sum of the probabilities of type I and type II errors:

$$ TME = \frac{C}{A+C} + \frac{B}{B+D}. $$
(1)

The lower the TME, the better the variable identifies the actual crisis. Since the number of crises (A + C) is very low compared to the number of tranquil periods (B + D), the measure implicitly values a lower number of missed crises more than a lower number of false alarms.

2.1.2 Noise to Signal Ratio

The noise-to-signal (NtS) ratio is used in the original signal approach study of Kaminsky et al. (1998) and defined as:

$$ NtS = \frac{B/(B+D)}{A/(A+C)}. $$
(2)

The lower the NtS, the better the variable identifies the actual crisis. Indicators with an NtS equal to or greater than 1 should be discarded, since these do not have intrinsic predictive power. According to this criterion, false alarms and missed crises are treated with equal weights.
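As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the contingency counts of Table 1 and the two criteria from a list of signals and a list of crisis outcomes. The function names and the toy data are assumptions made for this example only; they are not taken from the estimation code behind the paper.

```python
from typing import Sequence, Tuple


def contingency(signals: Sequence[int], crises: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count the cells (A, B, C, D) of the contingency table."""
    pairs = list(zip(signals, crises))
    a = sum(1 for s, c in pairs if s == 1 and c == 1)  # correct crisis signals
    b = sum(1 for s, c in pairs if s == 1 and c == 0)  # false alarms (type II)
    c_missed = sum(1 for s, c in pairs if s == 0 and c == 1)  # missed crises (type I)
    d = sum(1 for s, c in pairs if s == 0 and c == 0)  # correct non-crisis signals
    return a, b, c_missed, d


def tme(a: int, b: int, c: int, d: int) -> float:
    """Total Misspecification Error: P(type I) + P(type II)."""
    return c / (a + c) + b / (b + d)


def nts(a: int, b: int, c: int, d: int) -> float:
    """Noise-to-signal ratio; infinite when no crisis is signaled correctly."""
    hit_rate = a / (a + c)
    if hit_rate == 0:
        return float("inf")
    return (b / (b + d)) / hit_rate


# Hypothetical toy data: 8 observations, 2 of which are crisis observations.
signals = [1, 1, 0, 0, 1, 0, 0, 0]
crises = [1, 0, 1, 0, 0, 0, 0, 0]
counts = contingency(signals, crises)
print(counts, round(tme(*counts), 3), round(nts(*counts), 3))
```

In this toy example the counts are A = 1, B = 2, C = 1 and D = 4, so the TME is about 0.83 and the NtS about 0.67. Under the NtS criterion such a hypothetical indicator would be retained, since its NtS is below 1.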

2.2 Discrete Choice Models

Logit and probit models are widely used in EWSs for financial crises, including currency crises. Compared to the signal approach, the logit and probit models have two advantages. First, the methods consider all the indicators simultaneously (Kaminsky et al. 1998). Second, the independent variables can have a nonlinear effect on the probability of a crisis, which is appropriate because of the presence of strong nonlinear effects in currency crisis mechanisms (Bussière 2007).

In the binary logit model the dependent variable is dichotomous and takes the value of 1 if the event occurs and 0 otherwise. In this setup, $Y_{it}$ represents a binary variable for country $i \in \{1,\ldots,N\}$ at time $t \in \{1,\ldots,T_{i}\}$, where $T_{i}$ denotes the number of time periods considered for the $i$th country. The probability of an event is characterized by the logistic distribution. That is, for each country, the probability of the event is given by:

$$ P(Y_{it} = 1) = \frac{\exp(X_{it}\beta)}{1 + \exp(X_{it} \beta)}, \quad i = 1, \ldots, N; \; t = 1, \ldots, T_{i}, $$
(3)

where $X_{it}$ denotes a vector of exogenous variables and $\beta$ the vector of slope parameters.Footnote 5

The odds ratio, which is useful for interpretations, is determined as

$$ \frac{P(Y_{it} = 1)}{1-P(Y_{it}=1)} = \exp(X_{it} \beta). $$
(4)
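To make Eqs. (3) and (4) concrete, the short Python sketch below evaluates the logit probability for one observation and checks that the implied odds equal exp(Xβ). The indicator values and coefficients are hypothetical numbers chosen for illustration; they are not estimates from this paper.

```python
import math
from typing import Sequence


def crisis_probability(x: Sequence[float], beta: Sequence[float]) -> float:
    """Logit probability exp(x'beta) / (1 + exp(x'beta))."""
    z = sum(xi * bi for xi, bi in zip(x, beta))
    # 1 / (1 + exp(-z)) is algebraically identical to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))


def odds(p: float) -> float:
    """Odds of the event: P / (1 - P)."""
    return p / (1.0 - p)


# Hypothetical standardized indicators and slope parameters.
x = [1.0, 0.5]
beta = [-2.0, 1.0]
p = crisis_probability(x, beta)
print(round(p, 4), round(odds(p), 4), round(math.exp(-1.5), 4))
```

Here the linear index is −1.5, so the odds printed in the second position coincide with exp(−1.5), as Eq. (4) requires.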

A common alternative among discrete choice models is the probit model. In this paper we prefer the logit model because crisis events do not occur often (as is the case for currency crises and sovereign debt crises), and the logistic distribution (logit model) has heavier tails than the normal distribution (probit model) (Manasse et al. 2003; Bussière 2007). However, out-of-sample performances are broadly similar (Comelli 2014).

2.3 Out-of-Sample Performance

We estimate the early warning systems for the period 1991Q1–2010Q4 with current-vintage data. Then we compare their out-of-sample predictions for the period 2011Q1–2017Q4 using (i) current-vintage data and (ii) real-time indicator forecasts. We include the currency crises that occurred in the second half of 2008 (after the fall of Lehman Brothers) in the in-sample period, because this crisis has elements not seen in earlier crises, such as the involvement of advanced economies. Moreover, this choice increases the number of crisis observations in the estimation period. According to Frankel and Saravelos (2012), the leading indicators that most frequently appeared in earlier reviews are not statistically significant in the Global Financial Crisis.

2.3.1 Measures for the Out-of-Sample Performance

The methods used to determine the optimal threshold (the Total Misspecification Error or the loss function approach; see Section 2.1) can also be used to measure the out-of-sample performance of the signal approach and the binary logit model. For the logit model an additional measure is available: the quadratic probability score (QPS), proposed by Diebold and Rudebusch (1989) to evaluate out-of-sample forecasts. This measure indicates how close, on average, the predicted probabilities $P_{t}$ are to the observed realizations $Z_{t}$. The QPS is given by

$$ QPS = \frac{1}{T}\sum\limits_{t = 1}^{T} 2(P_{t} - Z_{t})^{2}. $$

The QPS ranges from 0 to 2, with a score of 0 corresponding to perfect accuracy (well-predicted crisis, or a well-predicted tranquil period), and a score of 2 corresponding to a perfect false signal (missed crisis or false alarm).
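The QPS formula and its bounds can be checked with a few lines of Python. The sketch below is illustrative only; the function name and the example probabilities are assumptions.

```python
from typing import Sequence


def qps(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Quadratic probability score: mean of 2 * (P_t - Z_t)^2."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum(2.0 * (p - z) ** 2 for p, z in zip(probs, outcomes)) / len(probs)


# Perfect forecasts, completely wrong forecasts, and uninformative forecasts.
print(qps([1.0, 0.0], [1, 0]))  # lower bound
print(qps([0.0, 1.0], [1, 0]))  # upper bound
print(qps([0.5, 0.5], [1, 0]))  # constant probability of one half
```

The three calls return 0, 2 and 0.5 respectively, which illustrates the range described above and shows that an uninformative constant probability of 0.5 already scores 0.5.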

3 Data

We focus on emerging economies, in particular on three regions (Latin America, Central and Eastern Europe (CEE) and Asia) for which real-time data, and notably indicator forecasts, are available. Countries in these regions implemented market reforms in the 1990s after a period of domestically-oriented economic policies. All countries have experienced political and institutional changes, changes in exchange rate regimes and at least one currency crisis since the early 1990s. For Latin America we select Argentina, Brazil, Colombia, Mexico and Venezuela. These countries are the largest economies in the region in terms of GDP and share economic and institutional features, such as the importance of commodities, a history of changes in exchange rate regimes, and political and institutional changes. For CEE we take the five largest economies in the region: Russia, Poland, the Czech Republic, Hungary and Ukraine. Our sample is completed with the Asian countries Indonesia, Korea, Malaysia, the Philippines and Thailand.

We employ quarterly data. This higher frequency is recommended in early warning systems of currency crises, because a crisis often develops rapidly. There is a complication though. Indicator forecasts are issued for the full current (calendar) year and the following year. If we use the indicator forecast of the current year, then the forecast in the first quarter is much more imprecise than the forecast in the fourth quarter, which would introduce forecast bias. Therefore we follow Dovern et al. (2012) and approximate fixed horizon forecasts as a weighted average of fixed event forecasts. We calculate four-quarters ahead forecasts, by taking weighted averages of these annual indicator forecasts. For example, the 2010Q3 forecast is calculated as 1/4 times the 2009 forecast of vintage 2009Q3 plus 3/4 times the 2010 forecast of vintage 2009Q3. Table 2 illustrates our approach graphically.

Table 2 Real-time data trapezoid for quarterly frequency
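The weighting scheme described above can be sketched in Python as follows. The sketch assumes that a forecast vintage dated in quarter q of a year puts weight (4 − q)/4 on the forecast for the current year and q/4 on the forecast for the next year. This rule generalizes the single example given in the text (q = 3 gives weights 1/4 and 3/4) and is therefore an assumption, as are the function name and the numbers used.

```python
def fixed_horizon_forecast(current_year_fc: float, next_year_fc: float, quarter: int) -> float:
    """Approximate a four-quarters-ahead forecast from two annual forecasts.

    quarter is the quarter (1 to 4) in which the forecast vintage is dated.
    Assumed weights: (4 - quarter) / 4 on the current-year forecast and
    quarter / 4 on the next-year forecast.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    w_current = (4 - quarter) / 4
    return w_current * current_year_fc + (1 - w_current) * next_year_fc


# Vintage 2009Q3 with hypothetical annual growth forecasts of 2.0 (2009) and 4.0 (2010):
# the approximate forecast for 2010Q3 is 1/4 * 2.0 + 3/4 * 4.0.
print(fixed_horizon_forecast(2.0, 4.0, quarter=3))
```

With these hypothetical inputs the four-quarters-ahead forecast is 3.5. Under the assumed rule, a vintage dated in the fourth quarter relies entirely on the forecast for the next year.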

3.1 Currency Crisis Dating

Our dating of currency crises is based on Boonman (2019), who constructs a database with currency crisis dates by combining a broad range of quantitative definitions with the narrative, other studies and expert opinions. A crisis exclusion period of four quarters is implemented, which implies that a crisis that takes place within four quarters after a previous crisis is not considered a separate crisis, but a continuation of the previous crisis. We label the four quarters prior to the crisis as the crisis run-up period. This run-up period is employed because we are interested in an early warning system, so we want to detect a currency crisis before it actually occurs.
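The crisis exclusion period can be illustrated with a small Python sketch that merges crisis quarters into episodes. It assumes that quarters are coded as consecutive integers and that the four-quarter window is measured from the last crisis quarter of the ongoing episode; both choices, like the function name and the example dates, are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def merge_crisis_quarters(crisis_quarters: Iterable[int], exclusion: int = 4) -> List[Tuple[int, int]]:
    """Group crisis quarters into episodes (start, end).

    A crisis quarter that falls within `exclusion` quarters after the last
    crisis quarter of the current episode is treated as a continuation.
    """
    episodes: List[List[int]] = []
    for q in sorted(crisis_quarters):
        if episodes and q - episodes[-1][1] <= exclusion:
            episodes[-1][1] = q  # continuation of the previous crisis
        else:
            episodes.append([q, q])  # start of a new, separate crisis
    return [(start, end) for start, end in episodes]


# Hypothetical crisis quarters, indexed 0, 1, 2, ... from the start of the sample.
print(merge_crisis_quarters([3, 5, 12, 20, 23]))
```

In this example quarters 3 and 5 form one episode, quarter 12 is a separate crisis, and quarters 20 and 23 are merged, giving three episodes in total.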

The resulting currency crises are shown in Table 3. The third column of the table shows that 33 currency crisis episodes occurred in the period 1991Q1–2010Q4. Every country experienced at least one crisis. The last column shows that there have been nine currency crisis episodes in the out-of-sample period 2011Q1–2017Q4. Only seven countries experienced a crisis in this period. All crises are based on both the exchange market pressure index and major currency depreciations, except the 2015Q3 crisis in Brazil, which is based solely on large depreciations.

Table 3 Identification of currency crises, 1991Q1–2017Q4

3.2 Explanatory Variables

As explanatory variables we use 10–15 economic and financial indicators that correspond to different types of currency crises (Kaminsky 2006), and that are typically used in studies on EWSs for currency crises. The explanatory variables can be divided into three groups. The first group contains variables that are available for each country as forecasts for the out-of-sample period 2010Q1–2016Q4: real GDP growth, inflation, exports growth, change in reserves, imports to reserves ratio, fiscal balance to GDP, interest rate, external debt to GDP and current account balance to GDP.Footnote 6 For each of these indicators we also use the change (both with respect to the previous quarter and with respect to the same quarter in the previous year). The second group consists of global variables that capture external conditions. These variables are also available as forecasts, but they are equal for each country in our panel. We use the USA as a proxy for global conditions, and select US real GDP growth and the US Federal funds rate to capture external conditions. The third group consists of country-specific indicators that are not available as forecasts, but are nonetheless important to include in an EWS for currency crises as control variables. We take these from current-vintage data: the cyclical component of domestic credit to GDP and the overvaluation or undervaluation of the Real Exchange Rate. For both indicators we apply the Hodrick-Prescott filter with a smoothing parameter of 1600 to capture the long-run trend. For the signal approach we only use variables from the first group, while for the logit models we employ all variables.

The indicator forecasts are taken from the Focus Economic Consensus of Haver Analytics (HA). The database offers monthly reported forecasts for economic indicators. We take the value of the last month of the quarter as the quarterly observation. Berkmen et al. (2012) use consensus growth forecast changes, which has the advantage of pooling across various forecasters and potentially suffering from less bias. For the same reason we prefer to use the consensus data from HA.

For the current-vintage data we use International Financial Statistics (IMF), Haver Analytics and World Development Indicators, consulted in October 2018. Details on the variables, their sources, frequency and availability are shown in Appendix A. All explanatory variables have been standardized, using country-specific means and standard deviations of the in-sample period (1991Q1–2010Q4).
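The standardization step can be sketched as follows for the series of one indicator in one country. The in-sample mean and standard deviation are applied to the whole series, so that out-of-sample values do not influence the scaling. The use of the sample standard deviation (rather than the population version) is an assumption, as are the function name and the toy numbers.

```python
from statistics import mean, stdev
from typing import List, Sequence


def standardize(values: Sequence[float], n_in_sample: int) -> List[float]:
    """Standardize a full series using in-sample moments only.

    values holds one indicator for one country in chronological order;
    the first n_in_sample observations form the in-sample period.
    """
    in_sample = values[:n_in_sample]
    mu = mean(in_sample)
    sigma = stdev(in_sample)  # sample standard deviation (assumption)
    return [(v - mu) / sigma for v in values]


# Hypothetical series: three in-sample and two out-of-sample observations.
print(standardize([1.0, 2.0, 3.0, 4.0, 5.0], n_in_sample=3))
```

In a panel, the function would be called separately for each country and each indicator, which corresponds to the country-specific means and standard deviations mentioned above.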

4 Empirical Results

Since early warning systems are only useful when they issue an early warning signal, it is common to use indicators from period t − 1 to detect a possible crisis in period t. In other words, the model links the dependent variable (the crisis dummy) to a selection of indicators from the period prior to the crisis. We assign a value of 1 to the crisis dummy for the four quarters that precede the crisis start, and we exclude all observations of the crisis quarter(s) themselves, since the indicators typically show extreme behavior related to the crisis.
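The construction of the dependent variable can be illustrated with the following Python sketch. It marks the four quarters before each crisis start with 1, tranquil quarters with 0, and crisis quarters with None so that they can be dropped before estimation. The integer coding of quarters, the use of None as an exclusion marker and the function name are assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple


def runup_dummy(n_quarters: int, episodes: Sequence[Tuple[int, int]], window: int = 4) -> List[Optional[int]]:
    """Build the crisis dummy for one country.

    episodes contains (start, end) quarter indices of each crisis, inclusive.
    Returns 1 for run-up quarters, 0 for tranquil quarters and None for
    crisis quarters, which are excluded from the estimation sample.
    """
    dummy: List[Optional[int]] = [0] * n_quarters
    for start, _ in episodes:
        for t in range(max(0, start - window), start):
            dummy[t] = 1  # run-up period before the crisis start
    for start, end in episodes:
        for t in range(start, min(end, n_quarters - 1) + 1):
            dummy[t] = None  # crisis quarters themselves are excluded
    return dummy


# Hypothetical country with 10 quarters and one crisis in quarters 6 and 7.
print(runup_dummy(10, [(6, 7)]))
```

For this hypothetical country, quarters 2 to 5 receive the value 1, quarters 6 and 7 are excluded, and all remaining quarters are 0.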

4.1 Signal Approach

In the signal approach we determine the optimal threshold separately for each indicator, based on the in-sample period only and using the lowest Total Misspecification Error (TME) as the criterion. Most indicators enter in three forms: the level, the change vis-à-vis the previous quarter, and the change vis-à-vis the same quarter of the previous year. We include each indicator in one form only, so that we have nine indicators. The results are shown in Table 4. For each indicator we indicate whether a value below (<) or above (>) the threshold signals a currency crisis, and report the contingency table and the outcomes of the TME and the Noise-to-Signal ratio.

Table 4 Signal Approach: In-sample results for nine indicators. Based on 15 countries, period 1991Q1 to 2010Q4
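A threshold search of this kind can be sketched as a simple grid search that keeps the candidate with the lowest TME. The grid of candidate thresholds, the function names and the toy data below are assumptions for illustration; the sketch also assumes that the sample contains at least one crisis and one tranquil observation.

```python
from typing import Sequence, Tuple


def tme_for_threshold(values: Sequence[float], crises: Sequence[int], threshold: float, above: bool = True) -> float:
    """TME of the signal 'value > threshold' (or '< threshold' if above is False)."""
    a = b = c = d = 0
    for v, crisis in zip(values, crises):
        signal = v > threshold if above else v < threshold
        if signal and crisis:
            a += 1  # correct crisis signal
        elif signal and not crisis:
            b += 1  # false alarm
        elif not signal and crisis:
            c += 1  # missed crisis
        else:
            d += 1  # correct non-crisis signal
    return c / (a + c) + b / (b + d)


def best_threshold(values: Sequence[float], crises: Sequence[int], grid: Sequence[float], above: bool = True) -> Tuple[float, float]:
    """Return (threshold, TME) for the grid point with the lowest TME."""
    scored = [(tme_for_threshold(values, crises, thr, above), thr) for thr in grid]
    lowest_tme, thr = min(scored)
    return thr, lowest_tme


# Hypothetical standardized indicator and crisis (run-up) dummy.
values = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
crises = [0, 0, 0, 1, 0, 1]
grid = [-0.75, -0.25, 0.25, 0.75, 1.25]
print(best_threshold(values, crises, grid, above=True))
```

In this toy example the threshold 0.25 is selected with a TME of 0.25: it signals both crisis observations and produces one false alarm in four tranquil observations. Because only in-sample observations enter the search, the resulting threshold can be carried over unchanged to an out-of-sample period.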

According to the TME criterion, the best indicator to signal a currency crisis is the quarterly change in external debt to GDP, as it has the lowest TME. It gives a correct signal for almost 80% of the crises (77.3%), and it fails to signal only 22.7% of the crises. The probability of a false alarm is high, however (48.7%). When the quarterly change in external debt to GDP is greater than −0.10σ, there is an increased probability of a currency crisis in the following 1–4 quarters. Since all data are standardized, the threshold can be expressed simply as −0.10. Similarly, a currency crisis is more likely when the current account to GDP is less than −0.10, when the quarterly change in the imports to reserves ratio is greater than 0.20, when the quarterly change in the fiscal balance to GDP is smaller than −0.10, and when the quarterly change in official reserves is less than 0.10. The other indicators work less well, as reflected by their higher TME values.

4.2 Out-of-Sample Performance

We compare the performance of using indicator forecasts versus current-vintage data in the predictions. We use the thresholds that were estimated in the in-sample period to predict crises in the out-of-sample period 2011Q1–2017Q4. The results are shown in Table 5.

Table 5 Signal Approach: Out-of-sample results for nine indicators. Based on 15 countries, period 2011Q1 to 2017Q4

The results provide support for the main hypothesis of our paper, namely that the prediction performance is better (lower TME and Noise-to-Signal ratio) when using current-vintage data rather than indicator forecasts. The quarterly change in reserves is the exception.

Table 11 in Appendix B shows the crisis predictions on the individual crisis level.Footnote 7 As argued in Section 2.1, we assume that missed crises have a higher cost than false alarms, and we are therefore particularly interested in the performance of the two alternative data sets (current-vintage data and indicator forecasts) in currency crisis episodes. Four indicators (real GDP growth, inflation, the change in reserves, and external debt to GDP) predict all crises, both when current-vintage data and when indicator forecasts are used. For the other indicators the current-vintage data perform better than the indicator forecasts. For example, the indicator “change in imports-to-reserves” misses only two crises when current-vintage data are used, and it misses four crises when indicator forecasts are used. We also observe that some indicators do not pick up a crisis, neither with current-vintage data nor with indicator forecasts. The indicator “change in imports-to-reserves” does not pick up the Brazil 2015 crisis, the indicator “fiscal balance to GDP” does not pick up the Malaysia 2015 crisis, and the interest rate does not pick up the Colombia 2015 crisis. We conclude that, also on the individual crisis level, the current-vintage data perform as well as or better than the indicator forecasts.

4.3 Binary Logit Model

We estimate the binary logit model for the in-sample period (1991Q1–2010Q4) on the pooled data set of 15 countries (see Section 3). The dependent variable is the currency crisis dummy variable as defined in Section 3.1. We include all potentially relevant variables, even when there is multicollinearity among variables. Under multicollinearity the estimates are unbiased and consistent, but the standard errors may be inflated. This does not affect the predictions. In addition, we include a regional dummy to distinguish the three geographical regions. Based on the Hausman test we employ fixed effects to account for unobserved heterogeneity and to mitigate endogeneity due to omitted variable bias. The outcomes of the logit model are shown in Table 6.

Table 6 Binary logit regressions with currency crisis dummy as the dependent variable for 15 countries using quarterly data from 1991Q1 to 2010Q4

To assess the performance of the model we convert the probability of a crisis into a dummy variable which takes the value 1 if a signal is sent, and 0 otherwise. For this conversion we set a threshold for the probability. The lower this threshold, the more crisis signals are sent (including false alarms), and the higher the threshold, the fewer crisis signals are sent (so that more crises are missed). We show the results for four thresholds in the in-sample period in Table 7. For all thresholds, the TME and NtS are substantially lower than for the signal approach (Table 4), which implies that the logit model performs better than the signal approach.

Table 7 Logit model: in-sample performance. Based on 15 countries, period 1991Q1 to 2010Q4

4.4 Out-of-Sample Performance

We now turn to the performance in the out-of-sample period, by comparing the generated probabilities of a currency crisis with the actual outcome. We use the same cut-off probabilities from the in-sample period and show the results in Table 8.

Table 8 Logit model: Out-of-sample performance. Based on 15 countries, period 2011Q1 to 2017Q4

As in the in-sample exercise, the current-vintage data perform better (in terms of lower TME) than the indicator forecasts for all cut-off probabilities. Additionally, the logit model with a cut-off probability lower than 0.30 performs better out-of-sample (in terms of lower TME) than the signal approach.

Outcomes for an alternative statistic for the performance of the predictions, the Quadratic Probability Score, are presented in Table 9. Here we can observe that current-vintage data do not always perform better than indicator forecasts. In particular for countries where no crisis took place (Brazil, Mexico and the Czech Republic), the QPS for predictions based on indicator forecasts is lower than the QPS for predictions based on current-vintage data, indicating that predictions based on indicator forecasts are better. For countries where a currency crisis took place the reverse holds: the QPS is lower for predictions based on current-vintage data than for predictions based on indicator forecasts.

Table 9 Quadratic probability score (QPS) for binary logit regressions: out-of-sample results 2011Q1–2017Q4

Appendix B shows the crisis predictions on the individual crisis level.Footnote 8 The logit model does not pick up the 2015Q3 crises in Brazil and Malaysia, neither with current-vintage data nor with indicator forecasts. These crises are also not predicted by the signal approach with the change in imports-to-reserves and the fiscal balance to GDP, respectively. The two crises have in common that the pre-crisis conditions were different from earlier crises, in particular the uncertainty about necessary reforms, political scandals, high levels of debt and the high dependence on commodities (including oil). For all thresholds, the logit model based on current-vintage data identifies the crisis in Venezuela in 2013, whereas the model based on indicator forecasts fails to do so. For the highest threshold (0.30), there are several differences. Both the Argentina 2014 crisis and the Venezuela 2013 crisis are picked up when using current-vintage data, while the indicator forecasts pick up the Ukraine 2014 crisis.

5 Discussion

Our results from the signal approach and the logit model confirm that using current-vintage data outperforms the use of indicator forecasts in predicting crises, both at the aggregate level and at the level of individual crises. Predicting financial crises, including currency crises, is notoriously hard, even in retrospect. Properly taking into account the information that is available at the moment a researcher has to prepare predictions makes it even more difficult. For supervisors, who assume that the costs of missed crises are higher than the costs associated with false alarms, this is not a comforting finding, because current-vintage data are not available in time for making genuine out-of-sample predictions. Supervisors may reconsider their loss function, and vary the relative penalty of type I and type II errors in an approach as in Alessi and Detken (2011).

We also find that the logit model performs consistently better than the signal approach, both in-sample and out-of-sample. The signal approach takes into account one indicator at a time. Currency crises are rarely triggered by a single indicator that is out of line, but rather occur when a number of indicators together make a country vulnerable. For example, a country with a high current account deficit but a large amount of reserves, high growth and a fiscal surplus would not be as vulnerable as a country that suffers from a lower current account deficit, but faces a high fiscal deficit, low reserves, low growth and a high external debt. Logit models take several indicators into account simultaneously, which works better for estimating and predicting currency crises in our sample.

We used several criteria to determine thresholds. The threshold and the performance depend to a large extent on the choice of these criteria. On the one hand, when type I errors are considered more costly than type II errors, the Total Misspecification Error is used, and the threshold tends to be lower to avoid missed crises although at the cost of false alarms. On the other hand, when both error types are considered equally costly, the Noise to Signal ratio is used. The threshold is higher to keep the number of false alarms low, although at the cost of more missed crises.
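The dependence of the threshold on the chosen criterion can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the Total Misspecification Error is taken to be the sum of the type I and type II error rates, the Noise to Signal ratio is taken to be the false alarm rate divided by the share of correctly signalled crises, a signal is issued when the indicator exceeds the threshold, and the data, threshold grid and function names are hypothetical.

```python
def error_rates(indicator, crisis, threshold):
    """Return (type I, type II) error rates for one threshold.

    Type I = share of missed crises, type II = share of false alarms.
    Assumes the sample contains at least one crisis and one calm period.
    """
    a = b = c = d = 0
    for value, is_crisis in zip(indicator, crisis):
        signal = value > threshold  # assumption: signal when above threshold
        if signal and is_crisis:
            a += 1  # correct signal
        elif signal:
            b += 1  # false alarm
        elif is_crisis:
            c += 1  # missed crisis
        else:
            d += 1  # correctly no signal
    return c / (a + c), b / (b + d)


def pick_threshold(indicator, crisis, grid, criterion):
    """Return the grid value that minimises the chosen criterion."""
    best_threshold, best_value = None, float("inf")
    for threshold in grid:
        type1, type2 = error_rates(indicator, crisis, threshold)
        if criterion == "tme":
            # Assumed definition: sum of both error rates.
            value = type1 + type2
        elif criterion == "nsr":
            if type1 == 1.0:
                continue  # no crisis is signalled, so the ratio is undefined
            # Assumed definition: false alarm rate over hit rate.
            value = type2 / (1.0 - type1)
        else:
            raise ValueError("criterion must be 'tme' or 'nsr'")
        if value < best_value:
            best_threshold, best_value = threshold, value
    return best_threshold


# Hypothetical toy data: ten observations, three of them crisis periods.
indicator = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
crisis = [False, False, False, False, True, False, False, True, False, True]
grid = [0.25, 0.45, 0.65, 0.85, 0.95]

tme_threshold = pick_threshold(indicator, crisis, grid, "tme")
nsr_threshold = pick_threshold(indicator, crisis, grid, "nsr")
print(tme_threshold, nsr_threshold)
```

On this toy sample the Noise to Signal criterion selects a higher threshold than the Total Misspecification Error criterion, in line with the pattern described above: fewer false alarms at the cost of more missed crises.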

6 Conclusion

In this paper we focus on the use of real-time data in early warning systems for currency crises. EWSs have been criticized on several grounds, one of which is related to data availability. The use of realized data for EWSs is not feasible in practice, since these data are not available when predictions are made.

We select fifteen emerging economies from three regions: Argentina, Brazil, Colombia, Mexico and Venezuela from Latin America; the Czech Republic, Hungary, Poland, Russia and Ukraine from Central and Eastern Europe (CEE); and the Asian countries Indonesia, Korea, Malaysia, the Philippines and Thailand. These countries form a more or less homogeneous sample in terms of size of the economy, comparable economic history since the 1990s including financial crises, and switches in exchange rate regimes. We analyze these countries in a pooled data set, covering the period from 1991Q1 to 2017Q4.

The signal approach is often used as an EWS for currency crises. We apply this method to nine indicators that are typically used in the literature on early warning systems for currency crises. We determine a threshold based on a criterion that avoids missed crises (type I errors) more than false alarms (type II errors), as the costs of the former are considered larger than the costs of the latter. In the out-of-sample period we use these thresholds to identify crises. Comparison of current-vintage data and indicator forecasts shows that the latter perform worse in signalling crises. This conclusion also holds for the logit model, which is the second EWS that we analyze. For both the signal approach and the logit model the results are consistent at the aggregate and individual crisis level.

We conclude that in standard EWSs current-vintage data perform better than indicator forecasts in terms of predicting currency crises. In other words, based on indicator forecasts a high number of crises will not be signaled in time. Some possible explanations are that the indicator forecasts, which are based on consensus expectations (and hence are point forecasts), tend to smooth out extremes, and that the realizations are more dramatic than predicted in the run-up to crises. Future research will focus on investigating alternative EWSs and on exploiting more information from the indicator forecasts, such as the disagreement among analysts and the skewness of the distribution of the forecasts, by using density forecasts rather than point forecasts.