The evidence for critical slowing down is evaluated in two steps. First, we investigate the early warning signals before real critical transitions: using six historical time series, we examine four well-known extreme financial events, namely Black Monday 1987, the Asian Crisis, the Dot-com Bubble and the 2008 Financial Crisis. Second, we control the rate of false positives (i.e., the likelihood of spurious early warnings) by estimating, with a bootstrap method, the probability of obtaining a similar or more extreme trend in an indicator by chance. Finally, we perform a robustness check with respect to the parameters set by the user in the analysis.
Financial time series
We analyze time series associated with the following four financial crises. Due to space considerations, we provide and discuss graphs only for pre-crisis sample size \(T=200\); trend statistics for the indicators are later reported for \(T=500\) as well.
Black Monday 1987 During a single day, October 19, 1987, the Dow Jones Industrial Average (DJIA) index lost nearly 22%. By the end of that month, most of the major exchanges had dropped by more than 20%. Stock markets around the world crashed, beginning in Hong Kong, spreading to Europe, and hitting the USA after other markets had declined by a significant margin. This event marked the beginning of a global stock market decline, making “Black Monday” one of the most dramatic days in recent financial history.
Figure 3 shows the time-varying early warning indicators during roughly half a year preceding “Black Monday”, based on the Standard & Poor’s 500 index (S&P 500) time series. The original time series in Fig. 3a is the logarithm of the daily S&P 500 index. The displayed time series starts 200 trading days before the crash and ends 100 days after it. Stock markets raced upward during the first half of 1987, but fell sharply in the last few months of the year. The vertical dashed line indicates Black Monday, which we identify with (the start of) the critical transition. Since we are interested in EWSs before the critical transition, the time-varying indicators are based strictly on the data before the dashed line. To facilitate the exposition, we place the critical transition at day 0 on the x-axis, clearly distinguishing the days before and after it. The red curve corresponds to the smoothed time series obtained with the Gaussian kernel smoother. The two-headed arrow shows the width of the moving window. Figure 3b shows the residuals, that is, the detrended time series used to estimate the early warning indicators.
Figure 3c–e shows the time-varying early warning indicators, AR(1), MI(1), and SD. They show that the great crash on “Black Monday” is preceded by overall upward trends in all three indicators. These positive trends are confirmed by positive Kendall rank correlation coefficients \(\tau \). The (naive) p values corresponding to these trends strongly indicate an increase in the indicators during the period preceding the critical transition, suggesting that the S&P 500 time series indeed slows down before the crash. As noted above, we will apply a bootstrap technique to obtain p values that take into account the temporal dependence of the time-varying indicators.
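To make the construction of the indicators concrete, the following R sketch detrends a log-price series with a Gaussian kernel smoother and computes rolling-window AR(1) and SD indicators together with their Kendall's \(\tau \) trend statistics. It is a minimal illustration, not the exact code behind Fig. 3: the helper name ews_indicators, the use of R's built-in ksmooth (whose bandwidth convention is not a plain standard deviation \(\sigma \)) and the default settings are illustrative assumptions.

```r
# Minimal sketch of the indicator pipeline (illustrative settings, not the exact ones used here).
ews_indicators <- function(logp, bandwidth = 10, n = length(logp) %/% 2) {
  t <- seq_along(logp)
  # Gaussian kernel smoother; ksmooth() scales its bandwidth so the kernel
  # quartiles lie at +/- 0.25 * bandwidth, i.e., it is not the sd directly.
  trend <- ksmooth(t, logp, kernel = "normal", bandwidth = bandwidth, x.points = t)$y
  res   <- logp - trend                          # detrended residuals (cf. Fig. 3b)
  idx   <- n:length(res)                         # right edge of each rolling window
  ar1 <- sapply(idx, function(i) {               # lag-1 autocorrelation per window
    w <- res[(i - n + 1):i]
    cor(w[-1], w[-length(w)])
  })
  sdv <- sapply(idx, function(i) sd(res[(i - n + 1):i]))          # rolling standard deviation
  list(ar1 = ar1, sd = sdv,
       tau_ar1 = cor(seq_along(ar1), ar1, method = "kendall"),    # trend statistics
       tau_sd  = cor(seq_along(sdv), sdv, method = "kendall"))
}
```

A naive p value for each trend could, for instance, be obtained from cor.test(seq_along(ar1), ar1, method = "kendall"); like the naive p values discussed in this subsection, such a p value ignores the serial dependence in the indicators, which is addressed by the bootstrap below.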
The Asian Crisis Using the same techniques, we examine the Hang Seng time series. Figure 4 shows the analysis of early warning indicators around one and a half years before the Asian Crisis. Panel (a) displays the logarithm of the daily Hang Seng index from November 1995 to July 1998. The series increases in the beginning but collapses around mid-1997, reflecting the onset of the Asian financial crisis in July 1997. The Asian Crisis consisted of a series of currency devaluations accompanied by stock market declines. The currency market first failed in Thailand, after its government stopped pegging the local currency, the Thai Baht, to the US dollar. The currency crisis rapidly triggered stock market declines spreading throughout Southeast Asia. Thailand, South Korea and Indonesia were most affected by the crisis. As a result, the stock markets in Japan and most of Southeast Asia fluctuated dramatically.
Figure 4 has a similar format to Fig. 3. The smooth red curve in panel (a) shows the moving average used for filtering. The dashed arrow shows the width of the moving window, which is again half the length of the analyzed time series. Panel (b) shows the residuals used to estimate the early warning indicators. The results for the Asian crisis, at least for \(T=200\) (\(n=100\)), are mixed; there is a significant positive trend in the SD indicator (panel (e)), but the AR(1) and MI(1) indicators shown in panels (c) and (d) show significant downward trends before the critical transition of July 1997.
The Dot-com Bubble Figure 5 presents the analysis of early warning indicators from \(n=100\) trading days before the collapse of the Dot-com bubble. Boosted by the commercial growth of the internet, the NASDAQ Composite index experienced a speculative bubble, as shown in Fig. 5a. The index peaked around the year 2000, and the latter part of the sample followed a typical boom and bust cycle; when the bubble “burst,” the stock prices of dot-com companies fell dramatically. Some companies, such as Pets.com, went out of business completely. Others, such as Cisco and Amazon.com, survived but their stocks declined by more than 80%.
As for the Asian Crisis, the analysis of the early warning indicators shows mixed results for the NASDAQ index before the Dot-com bubble. The bubble collapse is preceded by an overall upward trend in the standard deviation, but a downward trend in the AR(1) and MI(1) estimates.
The 2008 Financial Crisis The financial crisis of 2008 is known as the most severe financial crisis since the Great Depression of the 1930s. It was triggered by the bursting of the US housing bubble, which had peaked approximately in 2005–2006. In the preceding years, banks had given out more loans than ever before to potential home owners. When the housing bubble finally burst in the latter half of 2007, the secondary mortgage market collapsed and over 100 mortgage lenders went bankrupt during 2007 and 2008. Several major financial institutions failed or had to be rescued, including Lehman Brothers, Merrill Lynch, Washington Mutual and Citigroup. Economies worldwide experienced a severe recession and stock markets around the world declined.
Figure 6a–c shows the early warning indicators around one year before the 2008 financial crisis, based on three different time series: the S&P 500, the TED spread and the VIX. The VIX is a commonly used index of the implied (expected) volatility of the S&P 500 over the next 30 days.
The analyses show mixed results. For the S&P 500 index (I.), the AR(1) and MI(1) indicators, as well as the SD indicator, show downward trends. The analysis of the TED spread (II.) shows strong upward trends in the AR(1) and MI(1) indicators, suggesting a slowing down before the critical transition around the time Lehman Brothers went bankrupt. However, for the same period, the SD indicator shows a downward trend. The analysis of the VIX (III.) shows significant downward trends in the AR(1) and MI(1) indicators preceding the 2008 critical transition; the SD indicator also shows a significant downward trend.
Table 2 Studies of early warning indicators for critical transitions in different time series, for various sample sizes \(T\) and window lengths \(n\); \(\tau \) is the estimated Kendall's \(\tau \) coefficient

A summary of the naive p values is provided in Table 2. The trends observed in the AR(1), MI(1) and SD indicators preceding the crises are all either significantly positive or significantly negative. The only cases where all indicators are significantly positive are Black Monday with sample size \(T=200\) and the Asian Crisis for \(T=500\). Overall these results are highly significant, but often with the unexpected negative sign. This may be related to the fact that only naive p values were reported so far.
Bootstrap time series
Above we have reported results based on naive p values, which ignore the dependence among the sequentially observed values of the indicators. To take this dependence into account, for each empirically observed value of Kendall's \(\tau \) we also calculate Kendall's \(\tau \) for a large number of bootstrap time series. This allows us to assess a p value corrected for serial dependence. The bootstrap distribution of \(\tau \) values is used to assess the likelihood that the observed value of the trend statistic in the original data has occurred by chance in the absence of an actual trend, taking temporal dependence into account. The corrected p value is the estimated probability, in the absence of a trend, of observing among the bootstrap replications a \(\tau \) value equal to or larger than that obtained from the original time series.
Dakos et al. (2008) applied a bootstrap of the residuals from the smoothing step. However, this treats the residuals as independent, which does not seem realistic when the residuals are autocorrelated. We therefore decided instead to bootstrap the log returns of the original price series. Bootstrap log price series (levels) are then easily obtained by taking cumulative sums. The motivation for this approach is that log returns are known to exhibit very little serial correlation. Preliminary comparisons of the returns bootstrap with the residuals bootstrap showed that the latter is slightly less conservative (i.e., leads to smaller p values, with the possible risk of over-rejection due to ignoring the dependence mentioned above). Since refinements of the log returns bootstrap, for instance taking into account GARCH(1,1) effects, gave results very similar to those of the plain log returns bootstrap, we only report the results for the log returns bootstrap in this paper.
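As an illustration, a minimal version of this log-returns bootstrap for the trend in the AR(1) indicator could look as follows; it reuses the illustrative ews_indicators sketch above, and the function name and default of \(B=1000\) replications are assumptions, not the exact implementation.

```r
# Sketch of the log-returns bootstrap p value for the trend in the AR(1) indicator.
# logp: pre-crisis log prices; tau_obs: Kendall's tau estimated from the original series.
bootstrap_p_ar1 <- function(logp, tau_obs, B = 1000,
                            bandwidth = 10, n = length(logp) %/% 2) {
  r <- diff(logp)                                # log returns, nearly uncorrelated
  tau_boot <- replicate(B, {
    r_star    <- sample(r, replace = TRUE)       # i.i.d. resampling of the returns
    logp_star <- logp[1] + cumsum(c(0, r_star))  # rebuild a bootstrap log-price (level) series
    ews_indicators(logp_star, bandwidth, n)$tau_ar1
  })
  mean(tau_boot >= tau_obs)                      # fraction of bootstrap trends >= observed trend
}
```

By construction the bootstrap series contain no indicator trend other than what arises by chance, so the returned fraction estimates the probability \(P(\tau \ge \tau ^{*})\) defined below.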
The probability \(P(\tau \ge \tau ^{*})\) of obtaining, by chance, a trend statistic at least as large as the observed value \(\tau ^{*}\) is estimated by the bootstrap p value, i.e., the fraction of bootstrap trend statistics at least as large as the trend statistic of the original time series. These p values are indicated as percentages in Fig. 7. For the AR(1) and MI(1) indicators, the (one-sided) bootstrap p values, which take into account the temporal dependence in the indicators, are 4.5 and \(5.2\%\), respectively. Therefore, the trend in the AR(1) indicator for the S&P 500 index prior to Black Monday is significant at the conventional \(5\%\) significance level, and the trend in the MI(1) indicator at the \(10\%\) level. For the SD indicator, the bootstrap p value is \(30.9\%\), i.e., insignificant at the conventional significance levels of 10, \(5\%\) or less.
Table 3 Likelihood of obtaining the trend statistic estimates by chance, estimated by the fraction of bootstrap trend statistics (Kendall \(\tau \) values) larger than or equal to the trend statistic of the original residual records; number of bootstrap replications \(B=1000\), \(\sigma =10\)

Table 3 shows the likelihood of obtaining the observed or a larger trend statistic estimate by chance, as estimated by the fraction of bootstrapped Kendall's \(\tau \) trend statistics larger than or equal to the trend statistic of the original residual records. The p values in Table 3 can thus be interpreted as the probability of observing, by chance, a trend in the respective indicator that is as large as or larger than the trend in the indicator for the observed time series data.
Focusing first on the smaller sample size (\(n=100\), top panel), one can observe evidence of a positive trend in the AR(1) indicator, and to a lesser extent in the MI(1) indicator, prior to the 1987 Black Monday crash. We also observe some weak evidence of a positive trend in the SD indicator prior to the burst of the Dot-com bubble. If we strictly focus on results that are significant at the \(5\%\) significance level or smaller, the only remaining significant result is the positive trend in the AR(1) coefficient prior to the 1987 Black Monday crash. In this respect, our results do not strongly confirm the finding of Guttal et al. (2016) that the SD indicator increases significantly before financial crises. Note, however, that our results also do not contradict theirs; the difference in significance may be attributed to the use of a different set of crises, and to differences in pre-crisis sample size \(T\), estimation window \(n\), smoothing window size \(\sigma \), and details of the bootstrap methods used. Also note that Guttal et al. (2016) obtained their results for estimation window sample size \(n=500\) (about two years), which, as indicated above, we a priori considered unrealistically large to detect trends in the indicators reliably, as we expected those trends to develop on smaller time scales (of several months, up to half a year, say).

For the larger sample size (\(n=250\), bottom panel), we observe a marginally significant positive trend in the SD indicator before the 2008 Crisis based on the S&P 500 time series, and a very significant (at the 2% significance level) positive trend in the MI(1) indicator for the same crisis based on the TED spread. This latter result deserves further attention, in particular because the results obtained for MI(1) and AR(1) are very similar throughout Tables 2 and 3, except for the TED spread prior to the 2008 crisis for \(n=250\).
At first sight, one may be tempted to conclude that the MI(1) indicator is apparently able to pick up some nonlinear dependence within the time series that the AR(1) indicator misses. However, visual inspection of the time series of residuals from the smoothing step (Fig. 8, panel b) raised our concern that the trend may have been caused primarily by changes in the sample marginal distribution as the estimation window was moved forward. Figure 9 shows the histograms of the first and second half of the pre-crisis TED spread residuals. The histograms show that the residuals have a more fat-tailed distribution in the first half of the pre-crisis period than in the second half. This is confirmed by the excess kurtosis, which is 6.90 for the first half and only 0.22 for the second half of the pre-crisis residuals.
It is known that although the mutual information is theoretically invariant under (monotone) transformations of the marginal distributions of the time series considered, its estimation involves a discretization step (binning) which in practice does affect the estimates. Typically (and also in the algorithm as implemented in the R package “tseriesChaos” we used), the bins are taken of equal width between the smallest and largest observed values. Now imagine the case where the data are normally distributed versus the case where the data are leptokurtic (i.e., the distribution has fatter tails and the density is more peaked near the center of the distribution, relative to the normal). If we use the same number of bins in both cases (10 here), the bins near the mode of the distribution will be much fuller for the leptokurtic data than for the normal data, and vice versa for the bins in the tails. This may well affect the MI(1) estimates.
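To make the binning argument concrete, the sketch below implements a plain equal-width-bin estimate of the lag-1 mutual information of the kind just described; it is a simplified illustration, not the tseriesChaos code itself. Because the bin edges are tied to the sample minimum and maximum, a few extreme observations in a heavy-tailed window stretch the bins and concentrate most points in a handful of central cells.

```r
# Sketch of a lag-1 mutual information estimate with equal-width bins (in nats).
mi_lag1 <- function(x, bins = 10) {
  x1 <- x[-length(x)]; x2 <- x[-1]                    # lagged pairs (x_t, x_{t+1})
  br <- seq(min(x), max(x), length.out = bins + 1)    # equal-width bins over the observed range
  joint <- table(cut(x1, br, include.lowest = TRUE),
                 cut(x2, br, include.lowest = TRUE))
  p  <- joint / sum(joint)                            # joint bin probabilities
  px <- rowSums(p); py <- colSums(p)                  # marginal bin probabilities
  nz <- p > 0
  sum(p[nz] * log(p[nz] / outer(px, py)[nz]))         # sum of p * log(p / (px * py))
}
```

Applying such an estimator within each rolling window makes explicit that the bin edges, and hence the estimate, move with the sample extremes of that window.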
We therefore wish to check whether the small p value observed for the TED spread for \(n=250\) is not just a result of the marginal distribution changing as the moving window moves forward, while the marginal distribution of the bootstrap-based residual time series is (by construction) constant over time. To test this, in an additional numerical experiment we modified the algorithm for the estimation of MI(1) by adding a preliminary step: we transformed the marginal distribution of each time series for which MI(1) was to be calculated to the standard normal distribution, via the empirical CDF followed by the inverse standard normal CDF. In theory this marginal transformation does not change the mutual information but, as argued above, it may well affect the numerical estimate.
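This preliminary transformation can be sketched as follows; the rank-based empirical CDF convention rank/(N+1), which avoids probabilities of exactly 0 or 1, and the helper names are illustrative assumptions.

```r
# Map a sample to standard normal marginals (normal scores) before estimating MI(1).
to_normal_scores <- function(x)
  qnorm(rank(x, ties.method = "average") / (length(x) + 1))

# MI(1) estimate that no longer depends on the shape of the sample marginal distribution.
mi_lag1_normalized <- function(x, bins = 10) mi_lag1(to_normal_scores(x), bins = bins)
```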
Upon using this new way to estimate MI(1), which is independent of the sample marginals, the trend in the MI(1) estimate changed from \(\tau =0.769\) to \(\tau =-\,0.143\), which in fact has the same sign as the trend observed for the AR(1) indicator (\(\tau =-\,0.383\), see Table 2). Correspondingly, the p value changed from 0.018 to an insignificant value of 0.576. In our view, this indicates that the significant positive trend in the original MI(1) indicator arose only because the marginal distribution was more leptokurtic in the first part of the data than in the second, and hence that the significant result was spurious. Moreover, we rule out the possibility that the change in distribution is itself a sign of critical slowing down, as it is primarily caused by a short period with larger amplitude near the end of the first half of the sample. The smaller sample does not contain this period, which explains why a similar spurious result is not observed there. Taken together, the results obtained with MI(1) suggest that although the MI(1) indicator initially seemed promising for its ability to pick up nonlinear dependence, the standard binning-based algorithm for MI estimation is too sensitive to details of the sample marginal distribution to be applied directly in EWS settings.
Robustness of parameters
In addition to varying the sample size, we also wish to check the robustness of the results with respect to changing the other user-set parameters of the analysis. Apart from the pre-crisis sample size, there are two other key parameters to be set by the user: the smoothing bandwidth \(\sigma \) and the moving window size \(n\), which we now no longer necessarily take to be half the pre-crisis sample size. The bandwidth is an important parameter when filtering out long-term trends from the original time series. There is a trade-off involved in the bandwidth choice; a too narrow bandwidth would remove not only the long-run trends but also the short-run fluctuations we intend to study, while a too wide bandwidth might leave some slow trends in the residuals, which may lead to spurious detection of trends in the indicators. A similar trade-off plays a role in the choice of the estimation window size; a smaller window is better at tracking short-run changes, but a too small window with too few sample points makes the estimates less reliable.
To check the robustness of our analysis with respect to these parameters, we perform an additional analysis using various combinations of rolling estimation window (sample) size and bandwidth. The contour plot in Fig. 10 shows the influence of these parameters on the observed trend in the AR(1) indicator for Black Monday 1987. The white dot indicates the parameter combination used in our early warning analysis; the black dot indicates the combination of window size and bandwidth that shows the strongest positive trend. The distribution of Kendall's \(\tau \) values shown in Fig. 10 confirms the presence of strong positive trends in the contour plot. The contour plot also indicates that although the results clearly depend on the parameters chosen, they depend on them quite smoothly, in particular around the maximum, so that there is a large set of parameters for which significant results are obtained for the trend in the AR(1) coefficient prior to Black Monday 1987.
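The computation behind such a contour plot amounts to re-running the indicator analysis over a grid of parameter combinations, as in the following sketch; the grid values and function name are placeholders rather than the exact settings underlying Fig. 10, and the ews_indicators helper is the illustrative one introduced earlier.

```r
# Sketch of the robustness grid: Kendall's tau of the AR(1) indicator for each
# combination of rolling window size n and smoothing bandwidth.
robustness_grid <- function(logp,
                            n_grid  = seq(50, 150, by = 10),
                            bw_grid = seq(5, 40, by = 5)) {
  tau <- outer(n_grid, bw_grid, Vectorize(function(n, bw)
    ews_indicators(logp, bandwidth = bw, n = n)$tau_ar1))
  dimnames(tau) <- list(window = n_grid, bandwidth = bw_grid)
  tau  # e.g., visualize with filled.contour(n_grid, bw_grid, tau)
}
```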