This chapter analyses Bitcoin’s predictability using AI and statistical models and identifies the model that produces the most accurate results. A multivariate time series dataset was created to train an AI model (LSTM) and a statistical model (ARIMA) to predict the price of Bitcoin over time. The LSTM model achieved the highest accuracy (94%) and the lowest MAPE (5%). ARIMA had the best overall metrics but performed poorly when forecasting the future. The results suggest that the Bitcoin price can be forecast with a reasonable error rate, refuting the null hypothesis; however, Bitcoin is extremely volatile, which makes it difficult to obtain results that can be confidently used to assert its value ahead of time. Because the Bitcoin price index is affected by various external sources, forecasting it is intrinsically challenging. When attempting to predict this sort of data, the following constraints must be considered: the models (1) do not account for exogenous variable uncertainty, and (2) do not account for the fact that forecast-error variances vary with time (Fair, 1986).
- Long Short-Term Memory
- Autoregressive integrated moving average
- Time series analysis
Teker, D., Teker, S. and Ozyesil, M. (2019). Determinants of Cryptocurrency Price Movements. In: LAHSS-19, MEEIS-19 Nov. 12–14, 2019, Paris (France). 12 November 2019. Higher Education And Innovation Group. [Online]. Available at: https://doi.org/10.17758/HEAIG6.H1119510 [Accessed 18 November 2021].
Elder, J. and Kennedy, P. E. (2001). Testing for Unit Roots: What Should Students Be Taught? The Journal of Economic Education, 32 (2), pp.137–146. [Online]. Available at: https://doi.org/10.1080/00220480109595179 [Accessed 18 November 2021].
Selva, P. (2019). Augmented Dickey Fuller Test (ADF Test) – Must Read Guide. [Online]. Available at: https://www.machinelearningplus.com/time-series/augmented-dickey-fuller-test/ [Accessed 18 November 2021].
Lamothe-Fernández, P. et al. (2020). Deep Learning Methods for Modeling Bitcoin Price. Mathematics, 8 (8), p.1245. [Online]. Available at: https://doi.org/10.3390/math8081245 [Accessed 18 November 2021].
Balcilar, M. et al. (2017). Can volume predict Bitcoin returns and volatility? A quantiles-based approach. Economic Modelling, 64, pp.74–81. [Online]. Available at: https://doi.org/10.1016/j.econmod.2017.03.019.
Kristoufek, L. (2015). What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet Coherence Analysis. Scalas, E. (Ed). PLOS ONE, 10 (4), p.e0123923. [Online]. Available at: https://doi.org/10.1371/journal.pone.0123923.
Choi, H. and Varian, H. (2009). Predicting the present with Google trends. Google Research. Google Inc., Mountain View, CA.
Da, Z., Engelberg, J. and Gao, P. (2011). In Search of Attention. The Journal of Finance, 66 (5), pp.1461–1499. [Online]. Available at: https://doi.org/10.1111/j.1540-6261.2011.01679.x.
Bouoiyour, J. and Selmi, R. (2015). What Does Bitcoin Look Like? Annals of Economics and Finance, 16, pp.449–492.
Hyndman, R. J. and Athanasopoulos, G. (2018). Forecasting: Principles and Practice (2nd ed). [Online]. Available at: https://otexts.com/fpp2/ [Accessed 9 December 2021].
Guyon, I. and Elisseeff, A. (2003). An Introduction to Variable and Feature Selection. p.26.
Ariyo, A. A., Adewumi, A. O. and Ayo, C. K. (2014). Stock Price Prediction Using the ARIMA Model. In: 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation. March 2014. pp.106–112. [Online]. Available at: https://doi.org/10.1109/UKSim.2014.67.
McNally, S., Roche, J. and Caton, S. (2018). Predicting the Price of Bitcoin Using Machine Learning. In: 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). March 2018. pp.339–343. [Online]. Available at: https://doi.org/10.1109/PDP2018.2018.00060
Wołk, K. (2020). Advanced social media sentiment analysis for short-term cryptocurrency price prediction. Expert Systems, 37 (2), p.e12493. [Online]. Available at: https://doi.org/10.1111/exsy.12493.
Tandon, C. et al. (2021). How can we predict the impact of the social media messages on the value of cryptocurrency? Insights from big data analytics. International Journal of Information Management Data Insights, 1 (2), p.100035. [Online]. Available at: https://doi.org/10.1016/j.jjimei.2021.100035.
Brownlee, J. (2016). Feature Selection For Machine Learning in Python. Machine Learning Mastery. [Online]. Available at: https://machinelearningmastery.com/feature-selection-machine-learning-python/ [Accessed 13 December 2021].
Bitcoin.org. (2021). FAQ – Bitcoin. [Online]. Available at: https://bitcoin.org/en/faq [Accessed 14 December 2021].
Greaves, A. and Au, B. (2015). Using the Bitcoin Transaction Graph to Predict the Price of Bitcoin. p.8.
Kristoufek, L. (2013). BitCoin meets Google Trends and Wikipedia: Quantifying the relationship between phenomena of the Internet era. Scientific Reports, 3 (1), p.3415. [Online]. Available at: https://doi.org/10.1038/srep03415.
Urquhart, A. (2016). The inefficiency of bitcoin. Economics Letters, 148, pp.80–82.
Tinawi, I. (2019). Machine Learning for Time Series Anomaly Detection. p.55.
Rather, A. M., Agarwal, A. and Sastry, V. N. (2015). Recurrent neural network and a hybrid model for prediction of stock returns. Expert Systems with Applications, 42 (6), pp.3234–3241. [Online]. Available at: https://doi.org/10.1016/j.eswa.2014.12.003.
Giles, C. L. (2001). Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference. p.23.
Valencia, F., Gómez-Espinosa, A. and Valdés-Aguirre, B. (2019). Price Movement Prediction of Cryptocurrencies Using Sentiment Analysis and Machine Learning. Entropy, 21 (6), p.589. [Online]. Available at: https://doi.org/10.3390/e21060589.
Guo, Q. et al. (2021). MRC-LSTM: A Hybrid Approach of Multi-scale Residual CNN and LSTM to Predict Bitcoin Price. arXiv:2105.00707 [cs, q-fin]. [Online]. Available at: http://arxiv.org/abs/2105.00707 [Accessed 17 December 2021].
Wang, J.-J. et al. (2012). Stock index forecasting based on a hybrid model. Omega, 40 (6), pp.758–766. [Online]. Available at: https://doi.org/10.1016/j.omega.2011.07.008.
Cermak, V. (2017). Can Bitcoin Become a Viable Alternative to Fiat Currencies? An Empirical Analysis of Bitcoin’s Volatility Based on a GARCH Model. SSRN Scholarly Paper, Rochester, NY: Social Science Research Network. [Online]. Available at: https://doi.org/10.2139/ssrn.2961405 [Accessed 17 December 2021].
Xie, Y., Ueda, Y. and Sugiyama, M. (2021). A Two-Stage Short-Term Load Forecasting Method Using Long Short-Term Memory and Multilayer Perceptron. Energies, 14 (18), p.5873. [Online]. Available at: https://doi.org/10.3390/en14185873.
Fair, R. C. (1986). Chapter 33 Evaluating the predictive accuracy of models. In: Handbook of Econometrics. 3. Elsevier. pp.1979–1995. [Online]. Available at: https://doi.org/10.1016/S1573-4412(86)03013-1 [Accessed 19 January 2022].
Matta, M., Lunesu, I. and Marchesi, M. (2015). Bitcoin Spread Prediction Using Social And Web Search Media. p.10.
Nochai, R. and Nochai, T. (2006). ARIMA MODEL FOR FORECASTING OIL PALM PRICE. p.7.
Khashei, M., Bijari, M. and Raissi Ardali, G. A. (2012). Hybridization of autoregressive integrated moving average (ARIMA) with probabilistic neural networks (PNNs). Computers & Industrial Engineering, 63 (1), pp.37–45. [Online]. Available at: https://doi.org/10.1016/j.cie.2012.01.017.
Pai, P.-F. and Lin, C.-S. (2005). A hybrid ARIMA and support vector machines model in stock price forecasting. Omega, 33 (6), pp.497–505. [Online]. Available at: https://doi.org/10.1016/j.omega.2004.07.024.
Zhu, Y., Dickinson, D. and Li, J. (2017). Analysis on the influence factors of Bitcoin’s price based on VEC model. Financial Innovation, 3. [Online]. Available at: https://doi.org/10.1186/s40854-017-0054-0.
Olah, C. (2015). Understanding LSTM Networks -- colah’s blog. [Online]. Available at: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ [Accessed 17 February 2022].
Barão, S. M. M. (2008). Linear and Non-linear time series analysis: forecasting financial markets. [Online]. Available at: https://www.semanticscholar.org/paper/Linear-and-Non-linear-time-series-analysis%3A-markets-Bar%C3%A3o/c61536dba552c6a874f43c556e529eeb6e5409e3 [Accessed 17 February 2022].
Buchholz, M. et al. (2012). Bits and Bets Information, Price Volatility, and Demand for Bitcoin. p.48.
Paul, S. (2020). Feature Selection Tutorial in Python Sklearn. [Online]. Available at: https://www.datacamp.com/community/tutorials/feature-selection-python [Accessed 10 March 2022].
Edwards, J. (2022). Bitcoin’s Price History. [Online]. Available at: https://www.investopedia.com/articles/forex/121815/bitcoins-price-history.asp [Accessed 17 March 2022].
Aggarwal, G. et al. (2019). Understanding the Social Factors Affecting the Cryptocurrency Market. arXiv:1901.06245 [cs]. [Online]. Available at: http://arxiv.org/abs/1901.06245 [Accessed 17 March 2022].
Mudassir, M. et al. (2020). Time-series forecasting of Bitcoin prices using high-dimensional features: a machine learning approach. Neural Computing and Applications. [Online]. Available at: https://doi.org/10.1007/s00521-020-05129-6 [Accessed 24 March 2022].
Raudys, A. and Mockus, J. (1999). Comparison of ARMA and multilayer perceptron based methods for economic time series forecasting. Informatica, 10 (2), pp.231–244.
Box, G. E. P. and Jenkins, G. M. (1967). Statistical Models for Prediction and Control. Madison, Wisconsin: Department of Statistics, University of Wisconsin. Technical Reports #72, 77, 79, 94, 95, 99, 103, 104, 116, 121, and 122.
Ciaian, P., Rajcaniova, M. and Kancs, D. A. (2016). The economics of BitCoin price formation. Applied Economics, 48 (19), pp.1799–1815.
1.1 Appendix A: Data Retrieval Sources
The Bitcoin price index and Blockchain data were obtained on 18 February 2021 from the Blockchain API (https://blockchain.com/). The other exogenous variables, such as Gold Price and Exchange Rates, were extracted from Investing.com (https://uk.investing.com/commodities/gold and https://uk.investing.com/currencies/usd-cny-historical-data, respectively). Search volume data were retrieved from the Google Trends website (http://www.google.com/trends) on 22 February 2021 and from the Wikipedia article traffic statistics site (https://www.wikishark.com/) on 2 February 2021.
All Bitcoin and Blockchain historical data were validated and cross-checked for accuracy against popular sources such as CoinDesk (https://www.coindesk.com/price/bitcoin/) and Yahoo Finance (https://finance.yahoo.com/).
1.2 Appendix B: Model Data Pre-processing
LSTM: Training a network on data with a large range of values is problematic, as large input values can slow down learning and can sometimes prevent the network from learning effectively. Therefore, the dataset was scaled using the MinMaxScaler available in the sklearn library, which rescales the dataset values to lie between 0 and 1.
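The scaling step can be illustrated with a minimal sketch. The feature values and column meanings below are hypothetical, and fitting the scaler on a training slice only is an assumption made here to avoid leaking future values; the chapter states only that the dataset was scaled with sklearn's MinMaxScaler.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature matrix: column 0 is a price in USD,
# column 1 is a search-volume index. Values are illustrative only.
features = np.array([
    [29000.0, 40.0],
    [35000.0, 55.0],
    [47000.0, 90.0],
    [58000.0, 100.0],
])

# Assumption: the first three rows are training data, the last is held out.
train, test = features[:3], features[3:]

# Learn each column's minimum and maximum from the training rows only.
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)

# Reuse the training minimum and maximum for the held-out rows.
test_scaled = scaler.transform(test)

# Map scaled values back to the original units, e.g. to report forecasts in USD.
restored = scaler.inverse_transform(train_scaled)
```

Under this assumption, held-out values beyond the training range map outside the interval from 0 to 1, which is worth keeping in mind for a series that regularly reaches new highs.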
ARIMA: Selecting the order of the model is crucial to building a good model. First, the data were explored as discussed in the Results section. Unit root tests were then conducted to identify the best order of differencing, and a grid search iteratively fitted models with different orders. Finally, the best model was determined by plotting the residuals and comparing AIC scores.
1.3 Appendix C: ADF Test Scores
1.4 Appendix D: Findings
The boxplot shows the closing prices for each month across all years. The price gradually grew during the first 4 months, declined over the next 5 months, and finally began to rise again. There are also many outliers following May, which could be due to the high gains for BTC in May 2017 and May 2019, which amounted to over 50%. This could be what causes the model to produce predictions with higher MAE: the small number of observations was not comprehensive enough for the model to identify the outliers (Fig. 17).
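The per-month outliers that a boxplot displays can also be counted directly. The sketch below assumes the common 1.5 × IQR whisker rule and uses a synthetic closing-price series with an artificial spike in May; the function name `monthly_outliers` and all data are hypothetical, not taken from the chapter.

```python
import numpy as np
import pandas as pd


def monthly_outliers(close):
    """Flag closing prices outside the 1.5 * IQR whiskers of their calendar month."""
    months = close.index.month
    # Quartiles are computed per calendar month, pooled across all years.
    q1 = close.groupby(months).transform(lambda s: s.quantile(0.25))
    q3 = close.groupby(months).transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    return (close < q1 - 1.5 * iqr) | (close > q3 + 1.5 * iqr)


# Synthetic daily closing prices over three years (assumption).
dates = pd.date_range("2017-01-01", "2019-12-31", freq="D")
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(scale=1.0, size=len(dates)), index=dates)

# Hypothetical short-lived spike in May 2017.
close.loc["2017-05-20":"2017-05-22"] += 50

flags = monthly_outliers(close)

# Number of flagged observations per calendar month (1 = January).
outliers_per_month = flags.groupby(flags.index.month).sum()
print(outliers_per_month)
```

Counting flagged points per month gives a quick check on whether the outliers concentrate in particular months, as the boxplot suggests.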
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Jegathees, K.J., Usman, A.B., ODea, M. (2023). Assessing the Predictability of Bitcoin Using AI and Statistical Models. In: Maleh, Y., Alazab, M., Romdhani, I. (eds) Blockchain for Cybersecurity in Cyber-Physical Systems. Advances in Information Security, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-031-25506-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25505-2
Online ISBN: 978-3-031-25506-9