The data used are BitCoin daily and monthly prices for the period 01.05.2013-31.05.2018; the data source is CoinMarketCap (https://coinmarketcap.com/). As a first step, the frequency distribution of log returns is analysed (see Table 18 and Fig. 6). As can be seen, two symmetric fat tails are present in the distribution. The next step is the choice of thresholds for detecting overreactions. To obtain a sufficient number of observations we consider values ±10% of the average from the population, namely −0.04 for negative overreactions and 0.05 for positive ones. Detailed results are presented in Appendix 2.
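The detection step can be illustrated with a minimal sketch. It assumes that the two thresholds quoted above are applied directly to daily log returns and that overreactions are then counted per calendar month; the function names and the demo prices are hypothetical and are not taken from the data set.

```python
import math
from collections import Counter

# Thresholds quoted in the text; applying them directly to daily
# log returns is an assumption made for this sketch.
NEG_THRESHOLD = -0.04
POS_THRESHOLD = 0.05


def log_returns(closes):
    """Return ln(P_t / P_{t-1}) for consecutive closing prices."""
    return [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]


def count_overreactions(dates, closes):
    """Count negative and positive overreaction days per (year, month).

    `dates` holds (year, month, day) tuples aligned with `closes`.
    The first date has no return and is therefore skipped.
    """
    negative, positive = Counter(), Counter()
    for (year, month, _day), r in zip(dates[1:], log_returns(closes)):
        if r < NEG_THRESHOLD:
            negative[(year, month)] += 1
        elif r > POS_THRESHOLD:
            positive[(year, month)] += 1
    return negative, positive


# Hypothetical prices, for illustration only (not the paper's data).
demo_dates = [(2018, 1, 1), (2018, 1, 2), (2018, 1, 3), (2018, 2, 1)]
demo_closes = [100.0, 110.0, 100.0, 101.0]
demo_negative, demo_positive = count_overreactions(demo_dates, demo_closes)
```

In this toy example January contains one positive and one negative overreaction, while the small move on 1 February is not flagged.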
Visual inspection of Figs. 7 and 8 suggests that the frequency of overreactions varies over time (see Table 19). To provide additional evidence we carry out ANOVA analysis and Kruskal–Wallis tests (Table 2); both confirm that the differences between years are statistically significant, i.e. that the frequency of overreactions is time-varying.
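For reference, the two test statistics can be computed as in the following sketch, which assumes that the monthly overreaction counts have already been grouped by year. It is a plain-Python illustration of the one-way ANOVA F statistic and the Kruskal–Wallis H statistic (with the usual tie correction), not the software used to produce Table 2; the sample groups are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic based on average ranks, corrected for ties."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_part = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_part - 3 * (n_total + 1)
    # The correction is undefined if every observation is identical.
    return h / (1 - tie_term / (n_total ** 3 - n_total))


# Hypothetical groups (e.g. counts in three different years).
f_stat = anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The F statistic is compared with an F distribution with (k − 1, N − k) degrees of freedom and H with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups and N the total number of observations.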
Table 2 Results of ANOVA and non-parametric Kruskal–Wallis tests for statistical differences in the frequency of overreactions between different years
Next we carry out a correlation analysis. Table 3 reports the results for the following overreaction parameters: the number of negative (Over_negative) and positive (Over_positive) overreactions, the total number of overreactions (All_over) and the overreactions multiplier (Over_mult). These are correlated with three BitCoin indicators: close prices, returns and log returns.
Table 3 Correlation analysis between the frequency of overreactions and different BitCoin series indicators
There appears to be a positive (rather than negative, as one would expect) correlation between BitCoin prices and negative overreactions. By contrast, there is a negative correlation in the case of returns and log returns. The overreaction multiplier exhibits a rather strong negative correlation with BitCoin log returns. Finally, the overall number of overreactions has a rather weak correlation with prices.
To make sure that there is no need to shift the data in any direction we carry out a cross-correlation analysis of these indicators at the time intervals t and t + i, where i ∈ {−10, …, 10}. Figure 1 reports the cross-correlation between Bitcoin log returns and the frequency of (both positive and negative) overreactions for the whole sample period for different leads and lags. The highest coefficient corresponds to lag length zero, which means that there is no need to shift the data.
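The lead-lag check can be sketched as follows, assuming that the cross-correlation at displacement i is the Pearson correlation between x at time t and y at time t + i, computed on the overlapping part of the two series. The function name and the simulated series are hypothetical.

```python
import numpy as np


def cross_correlation(x, y, max_lag=10):
    """Pearson correlation between x_t and y_{t+i} for i in -max_lag..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    result = {}
    for i in range(-max_lag, max_lag + 1):
        if i >= 0:
            a, b = x[: n - i], y[i:]      # pair x_t with y_{t+i}
        else:
            a, b = x[-i:], y[: n + i]     # negative i: y leads x
        result[i] = float(np.corrcoef(a, b)[0, 1])
    return result


# Simulated example: y reproduces x with a delay of three periods,
# so the largest coefficient should appear at i = 3.
rng = np.random.default_rng(0)
demo_x = rng.normal(size=200)
demo_y = np.zeros(200)
demo_y[3:] = demo_x[:-3]
demo_cc = cross_correlation(demo_x, demo_y)
best_lag = max(demo_cc, key=demo_cc.get)
```

If the largest coefficient is found at i = 0, as reported for Fig. 1, the two series can be used contemporaneously without shifting either of them.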
To analyse further the relationship between BitCoin log returns and the frequency of overreactions we carry out ADF tests on the series of interest (see Table 4).
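To illustrate the logic of the unit root test, the sketch below computes the basic Dickey–Fuller t-statistic from a regression of the first difference on a constant and the lagged level. It is a simplification of the ADF test (no lagged differences are included) and is applied here to a simulated series; it is not the procedure used to obtain Table 4.

```python
import numpy as np


def dickey_fuller_t(series):
    """t-statistic on rho in  dy_t = alpha + rho * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression without augmentation lags.
    """
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov[1, 1]))


# Simulated stationary series (white noise), for illustration only.
rng = np.random.default_rng(0)
white_noise = rng.normal(size=500)
df_stat = dickey_fuller_t(white_noise)
```

The statistic is compared with Dickey–Fuller rather than standard t critical values; with a constant in the regression the asymptotic 5% critical value is approximately −2.86, and values below it lead to rejection of the unit root null.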
Table 4 Augmented Dickey–Fuller test: BitCoin log returns and overreactions frequency data
The unit root null is rejected in most cases, implying stationarity. The next step is testing H1 by running a simple linear regression and one with dummy variables (see Sect. 3 for details). The results for BitCoin closes, returns and log returns regressed against all overreactions, negative and positive overreactions are presented in Tables 5, 6, and 7, respectively. In all three cases the specification with the highest explanatory power is the one including negative and positive overreactions as separate variables, though in the case of BitCoin closes the positive sign of the coefficient on negative overreactions is not what one would expect.
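The comparison between the two specifications can be sketched with ordinary least squares, as below. The series are hypothetical and constructed so that negative and positive overreactions have effects of opposite sign; the coefficients used to build them are illustrative and are not the estimates reported in Tables 5, 6 and 7.

```python
import numpy as np


def fit_ols(y, *regressors):
    """Least-squares fit of y on a constant and the given regressors.

    Returns the coefficient vector (constant first) and the R-squared.
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(r, dtype=float) for r in regressors]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    r_squared = 1.0 - float(residuals @ residuals) / float(np.sum((y - y.mean()) ** 2))
    return coef, r_squared


# Hypothetical monthly counts and a dependent variable built from them.
of_neg = np.array([3.0, 5.0, 2.0, 8.0, 4.0, 6.0])
of_pos = np.array([4.0, 2.0, 6.0, 3.0, 7.0, 5.0])
y_demo = 0.10 - 0.05 * of_neg + 0.04 * of_pos

coef_total, r2_total = fit_ols(y_demo, of_neg + of_pos)    # all overreactions
coef_separate, r2_separate = fit_ols(y_demo, of_neg, of_pos)
```

Because the two effects offset each other in the total count, the specification with separate regressors has the higher R-squared in this example, which mirrors the pattern described above.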
Table 5 Regression analysis results: BitCoin closes
Table 6 Regression analysis results: BitCoin returns
Table 7 Regression analysis results: BitCoin log returns
To sum up, consistent with the theoretical priors, the total number of overreactions is not a significant regressor in any case. The best specification is the multiple linear regression model with the frequency of positive and negative overreactions as regressors, and the best results are obtained in the case of log returns as indicated by the multiple R for the whole model and the p-values for the estimated coefficients. Specifically, the selected specification is the following:
$$ \text{Bitcoin log return}_{i} = 0.0645 - 0.0939 \times OF_{i}^{-} + 0.1013 \times OF_{i}^{+} \tag{8} $$
which implies a strong positive (negative) relationship between Bitcoin log returns and the frequency of positive (negative) overreactions. On the whole, the above evidence supports H1. The difference between the actual and estimated values of Bitcoin can be seen as an indication of whether Bitcoin is over- or undervalued, and therefore of whether a price decrease or increase should be expected. Obviously, BitCoin should be bought in the case of undervaluation and sold in the case of overvaluation until the divergence between actual and estimated values disappears, at which stage positions should be closed.
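The trading rule just described can be sketched as follows, using the coefficients of Eq. (8). Reading "undervaluation" as the actual value lying below the estimated one is an assumption of this sketch, and the function names and the `tolerance` parameter are hypothetical.

```python
# Coefficients of Eq. (8) as reported in the text.
INTERCEPT = 0.0645
COEF_NEG = -0.0939
COEF_POS = 0.1013


def estimated_log_return(of_neg, of_pos):
    """Fitted Bitcoin log return for given overreaction frequencies."""
    return INTERCEPT + COEF_NEG * of_neg + COEF_POS * of_pos


def trading_signal(actual, estimated, tolerance=0.0):
    """Map the gap between actual and estimated values to an action.

    `tolerance` is a hypothetical band within which the divergence is
    treated as having disappeared.
    """
    gap = actual - estimated
    if gap < -tolerance:
        return "buy"    # actual below estimate: treated as undervaluation
    if gap > tolerance:
        return "sell"   # actual above estimate: treated as overvaluation
    return "close"      # no divergence: positions are closed


demo_estimate = estimated_log_return(of_neg=2, of_pos=3)
demo_signal = trading_signal(actual=0.05, estimated=demo_estimate)
```

With two negative and three positive overreactions the fitted log return is 0.0645 − 2 × 0.0939 + 3 × 0.1013 = 0.1806, so an actual value of 0.05 would be classified as undervaluation.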
As mentioned before, to show that the selected specification is indeed the best linear model we use the multilayer perceptron (MLP) method. Negative and positive overreactions are the independent variables (the inputs) and log returns are the dependent variable (the output) in the neural net. The learning algorithm previously described generates the following optimal neural net MLP 2-2-3-1:1 (Fig. 2).
We compare it with the linear neural net L 2-2-1:1 model, which consists of 2 inputs and 1 output. The results are presented in Tables 8 and 9.
Table 8 Comparative characteristics of neural networks
Table 9 Quality comparison of neural networks
As can be seen, the neural net based on the multilayer perceptron structure provides better results than the linear neural net: the control error is lower (0.0392 for the MLP vs 0.0801 for L); the ratio of the error standard deviation to the data standard deviation is also lower (0.4673 vs 0.5078); and the correlation is higher (0.8844 vs 0.8719).
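Two of these quality measures can be computed as in the sketch below. It assumes that the standard deviation ratio is the standard deviation of the prediction errors divided by the standard deviation of the observed data, and that the correlation is the Pearson coefficient between observed and predicted values; both definitions are assumptions, and the data are hypothetical.

```python
import numpy as np


def sd_ratio(actual, predicted):
    """Standard deviation of the errors relative to that of the data."""
    actual = np.asarray(actual, dtype=float)
    errors = actual - np.asarray(predicted, dtype=float)
    return float(np.std(errors, ddof=1) / np.std(actual, ddof=1))


def correlation(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(actual, predicted)[0, 1])


# Hypothetical observations and two sets of predictions.
observed = np.array([1.0, 2.0, 3.0, 4.0])
perfect = observed.copy()
mean_only = np.full(4, observed.mean())

ratio_perfect = sd_ratio(observed, perfect)       # errors have no spread
ratio_mean_only = sd_ratio(observed, mean_only)   # no better than the mean
corr_perfect = correlation(observed, perfect)
```

Under this definition a ratio close to zero indicates accurate predictions, whereas a ratio close to one means that the model does no better than predicting the sample mean.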
Figure 3 shows the distribution of BitCoin log returns, actual vs estimated (from the regression model and the neural network).
As can be seen, the estimates (from the regression model and the neural network, respectively) are very similar and very close to the actual values, which suggests that the regression model (Eq. 8) captures very well the behaviour of BitCoin prices.
We also estimate ARIMA(p, d, q) models with \( p \le 3;\;q \le 3;\;d = 0 \) choosing the best specification on the basis of the AIC and BIC information criteria. Specifically, we select the following models: ARIMA(2, 0, 2) (on the basis of the AIC criterion); ARIMA(1, 0, 0) and ARIMA(0, 0, 1) (on the basis of the BIC criterion). The parameter estimates are presented in Table 10.
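The reason why the two criteria can select different orders is illustrated below with the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the number of observations and L the maximised likelihood. The log-likelihoods, parameter counts and sample size in the example are hypothetical and are not the values underlying Table 10.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian (Schwarz) information criterion."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def select_orders(candidates, n_obs):
    """Return the orders minimising AIC and BIC.

    `candidates` maps an order (p, d, q) to (log_likelihood, n_params).
    """
    best_aic = min(candidates, key=lambda o: aic(*candidates[o]))
    best_bic = min(candidates, key=lambda o: bic(*candidates[o], n_obs))
    return best_aic, best_bic


# Hypothetical fitted models, for illustration only.
demo_candidates = {
    (2, 0, 2): (52.0, 6),
    (1, 0, 0): (47.0, 3),
    (0, 0, 1): (46.8, 3),
}
demo_best_aic, demo_best_bic = select_orders(demo_candidates, n_obs=60)
```

Since ln n exceeds 2 for any sample of eight or more observations, the BIC penalises additional parameters more heavily than the AIC and therefore tends to favour more parsimonious models, as in this example.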
Table 10 Parameter estimates for the best ARIMA models
As can be seen, Model 1 best captures the behaviour of BitCoin log returns: all regressors are significant at the 1% level, except \( \psi_{t-i} \), and the AIC has the smallest value.
To establish whether this specification can be improved by including information about the frequency of overreactions, ARMAX models (see Eq. 6) are estimated adding as regressors \( OF_{t}^{-} \) (negative overreactions) and \( OF_{t}^{+} \) (positive overreactions). The estimated parameters are reported in Table 11. Model 4 adds the frequency of negative overreactions and positive overreactions to Model 1. Model 5 is a version of Model 4 chosen on the basis of the AIC and BIC criteria.
Table 11 Estimated parameters for the ARMAX models
Clearly, Model 5 is the best specification for modelling BitCoin log returns: all parameters are statistically significant (except \( b_{t-6} \)), and there is no evidence of misspecification from the residual diagnostic tests. Figure 4 plots the estimated and actual values of BitCoin log returns.
Table 12 reports Granger causality tests between BitCoin log returns and both negative (OF−) and positive overreactions (OF+).
As can be seen, the null hypothesis of no causality is rejected for negative (OF−) and positive overreactions (OF+), but not for BitCoin log returns (Y), and therefore there is evidence that forecasts of the latter can be improved by including in a VAR specification the two former variables. The optimal lag length implied by both the AIC and BIC criteria is one (see Table 13). The estimates for the VAR(1)-model are reported in Table 14.
Table 12 Granger Causality Tests between BitCoin log returns and both negative (OF−) and positive overreactions (OF+)
Table 13 VAR lag length selection criteria
Table 14 VAR(1) parameter estimates
This model appears to be data congruent: it is stable (no root lies outside the unit circle), and there is no evidence of autocorrelation in the residuals. The IRF analysis (see Appendix 3, Figs. 9, 10 and 11 for details) shows that, in response to a 1-standard deviation shock to log returns, both negative (OF−) and positive overreactions (OF+) revert to their equilibrium value within six periods, whereas it takes log returns only one period to revert to equilibrium. There is hardly any response of log returns to shocks to either positive or negative overreactions, whilst both the latter variables tend to settle down after about six periods. The variance decomposition (VD) results are presented in Table 15. They suggest the following:
Table 15 Variance Decomposition
- The behaviour of Y is mostly explained by its previous dynamics (97.4%); \( OF^{-} \) accounts for only 0.2% of its variance, and \( OF^{+} \) for only 2.4%.
- The behaviour of \( OF^{-} \) is also mainly determined by its previous dynamics (76.7%), with Y explaining only 22.7% of its variance and \( OF^{+} \) only 0.6%.
- The behaviour of \( OF^{+} \) is mostly accounted for by the \( OF^{-} \) dynamics (43%), with Y explaining 36.9% of its variance and \( OF^{+} \) 20.1%.
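The stability check and the impulse responses reported for the VAR(1) model can be illustrated with the following sketch. It assumes a VAR(1) written as z_t = A z_{t−1} + e_t with the variables ordered as (Y, OF−, OF+), and it traces unit, non-orthogonalised shocks rather than the 1-standard deviation shocks used in the text. The coefficient matrix is hypothetical and is not the one estimated in Table 14.

```python
import numpy as np


def is_stable(coef):
    """A VAR(1) is stable if all eigenvalues of A lie inside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(coef)) < 1.0))


def impulse_responses(coef, horizon):
    """Responses to unit shocks for h = 0..horizon.

    Element [h][i, j] is the effect on variable i, h periods after a
    unit shock to variable j; for a VAR(1) this is the h-th power of A.
    """
    return [np.linalg.matrix_power(coef, h) for h in range(horizon + 1)]


# Hypothetical coefficient matrix, ordering (Y, OF-, OF+).
A_demo = np.array([
    [0.1, 0.0, 0.0],
    [0.3, 0.5, 0.0],
    [0.2, 0.1, 0.4],
])

demo_stable = is_stable(A_demo)
demo_irf = impulse_responses(A_demo, horizon=6)
y_own_response = [float(m[0, 0]) for m in demo_irf]   # Y after a shock to Y
```

In a stable system the powers of A shrink towards zero, so every response dies out; how quickly it does so depends on the size of the eigenvalues.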
Finally, we address the issue of seasonality (H2). Figure 5 suggests that the overreactions frequency tends to be higher at the end and the start of the year and lower at other times. Also, there appears to be a mid-year cycle: the frequency starts to increase in April, peaks in June-July and then falls till September with a “W” seasonality pattern.
Formal parametric (ANOVA) and non-parametric (Kruskal–Wallis) tests are performed; the results are presented in Tables 16 and 17. As can be seen, there are no statistically significant differences between the frequency of overreactions in different months of the year, i.e. there is no evidence of seasonality. Therefore H2 cannot be rejected, which is consistent with the visual evidence based on Fig. 5.
Table 16 Parametric ANOVA of monthly seasonality in the overreaction frequency
Table 17 Non-parametric Kruskal–Wallis of monthly seasonality in the overreaction frequency