Assessing the Predictability of Bitcoin Using AI and Statistical Models

Blockchain for Cybersecurity in Cyber-Physical Systems

Part of the book series: Advances in Information Security (ADIS, volume 102)

Abstract

This chapter analyses Bitcoin’s predictability using AI and statistical models and identifies the model that produces the most accurate results. A multivariate time series dataset was created to train an AI model (LSTM) and a statistical model (ARIMA) to predict the price of Bitcoin over time. The LSTM model achieved the highest accuracy (94%) and the lowest MAPE (5%). ARIMA had the best overall metrics, but it performed poorly when forecasting the future. The results suggest that it is possible to forecast the Bitcoin price with a reasonable error rate, refuting the null hypothesis; however, Bitcoin is extremely volatile, which makes it difficult to obtain results that can be used with confidence to assert its value ahead of time. Because the Bitcoin price index is affected by various external factors, forecasting it is intrinsically challenging. When attempting to predict this sort of data, the following constraints must be considered: the models (1) do not account for exogenous variable uncertainty, and (2) do not account for the fact that forecast-error variances vary with time (Fair, 1986).
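
For readers unfamiliar with the error metrics named here, the following is a minimal sketch of how MAPE, and the MAE mentioned in Appendix D, are conventionally computed. The function names and the sample numbers are illustrative assumptions; the chapter does not show its own evaluation code.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one observation")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the units of the series."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one observation")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical prices, not taken from the chapter's dataset.
actual_prices = [100.0, 200.0, 400.0]
predicted_prices = [110.0, 190.0, 380.0]

print(f"MAPE: {mape(actual_prices, predicted_prices):.2f}%")
print(f"MAE:  {mae(actual_prices, predicted_prices):.2f}")
```

Because MAPE divides by the actual value, it is scale-free and convenient for a series whose level changes by orders of magnitude, whereas MAE is reported in price units.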

References

  1. Teker, D., Teker, S. and Ozyesil, M. (2019). Determinants of Cryptocurrency Price Movements. In: LAHSS-19, MEEIS-19 Nov. 12–14, 2019, Paris (France). 12 November 2019. Higher Education And Innovation Group. [Online]. Available at: https://doi.org/10.17758/HEAIG6.H1119510 [Accessed 18 November 2021].

  2. Elder, J. and Kennedy, P. E. (2001). Testing for Unit Roots: What Should Students Be Taught? The Journal of Economic Education, 32 (2), pp.137–146. [Online]. Available at: https://doi.org/10.1080/00220480109595179 [Accessed 18 November 2021].

  3. Selva, P. (2019). Augmented Dickey Fuller Test (ADF Test) – Must Read Guide. [Online]. Available at: https://www.machinelearningplus.com/time-series/augmented-dickey-fuller-test/ [Accessed 18 November 2021].

  4. Lamothe-Fernández, P. et al. (2020). Deep Learning Methods for Modeling Bitcoin Price. Mathematics, 8 (8), p.1245. [Online]. Available at: https://doi.org/10.3390/math8081245 [Accessed 18 November 2021].

  5. Balcilar, M. et al. (2017). Can volume predict Bitcoin returns and volatility? A quantiles-based approach. Economic Modelling, 64, pp.74–81. [Online]. Available at: https://doi.org/10.1016/j.econmod.2017.03.019.

  6. Kristoufek, L. (2015). What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet Coherence Analysis. Scalas, E. (Ed). PLOS ONE, 10 (4), p.e0123923. [Online]. Available at: https://doi.org/10.1371/journal.pone.0123923.

  7. Choi, H. and Varian, H. (2009). Predicting the present with Google trends. Google Research. Google Inc., Mountain View, CA.

  8. Da, Z., Engelberg, J. and Gao, P. (2011). In Search of Attention. The Journal of Finance, 66 (5), pp.1461–1499. [Online]. Available at: https://doi.org/10.1111/j.1540-6261.2011.01679.x.

  9. Bouoiyour, J. and Selmi, R. (2015). What Does Bitcoin Look Like? Annals of Economics and Finance, 16, pp.449–492.

  10. Hyndman, R. J. and Athanasopoulos, G. (2018). Forecasting: Principles and Practice (2nd ed). [Online]. Available at: https://otexts.com/fpp2/ [Accessed 9 December 2021].

  11. Guyon, I. and Elisseeff, A. (2003). An Introduction to Variable and Feature Selection. p.26.

  12. Ariyo, A. A., Adewumi, A. O. and Ayo, C. K. (2014). Stock Price Prediction Using the ARIMA Model. In: 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation. March 2014. pp.106–112. [Online]. Available at: https://doi.org/10.1109/UKSim.2014.67.

  13. McNally, S., Roche, J. and Caton, S. (2018). Predicting the Price of Bitcoin Using Machine Learning. In: 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). March 2018. pp.339–343. [Online]. Available at: https://doi.org/10.1109/PDP2018.2018.00060.

  14. Wołk, K. (2020). Advanced social media sentiment analysis for short-term cryptocurrency price prediction. Expert Systems, 37 (2), p.e12493. [Online]. Available at: https://doi.org/10.1111/exsy.12493.

  15. Tandon, C. et al. (2021). How can we predict the impact of the social media messages on the value of cryptocurrency? Insights from big data analytics. International Journal of Information Management Data Insights, 1 (2), p.100035. [Online]. Available at: https://doi.org/10.1016/j.jjimei.2021.100035.

  16. Brownlee, J. (2016). Feature Selection For Machine Learning in Python. Machine Learning Mastery. [Online]. Available at: https://machinelearningmastery.com/feature-selection-machine-learning-python/ [Accessed 13 December 2021].

  17. Bitcoin.org. FAQ – Bitcoin. 2021 [Online]. Available at: https://bitcoin.org/en/faq [Accessed 14 December 2021].

  18. Greaves, A. and Au, B. (2015). Using the Bitcoin Transaction Graph to Predict the Price of Bitcoin. p.8.

  19. Kristoufek, L. (2013). BitCoin meets Google Trends and Wikipedia: Quantifying the relationship between phenomena of the Internet era. Scientific Reports, 3 (1), p.3415. [Online]. Available at: https://doi.org/10.1038/srep03415.

  20. Urquhart, A. (2016). The inefficiency of bitcoin. Economics Letters, 148, pp.80–82.

  21. Tinawi, I. (2019). Machine Learning for Time Series Anomaly Detection. p.55.

  22. Rather, A. M., Agarwal, A. and Sastry, V. N. (2015). Recurrent neural network and a hybrid model for prediction of stock returns. Expert Systems with Applications, 42 (6), pp.3234–3241. [Online]. Available at: https://doi.org/10.1016/j.eswa.2014.12.003.

  23. Giles, C. L. (2001). Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference. p.23.

  24. Valencia, F., Gómez-Espinosa, A. and Valdés-Aguirre, B. (2019). Price Movement Prediction of Cryptocurrencies Using Sentiment Analysis and Machine Learning. Entropy, 21 (6), p.589. [Online]. Available at: https://doi.org/10.3390/e21060589.

  25. Guo, Q. et al. (2021). MRC-LSTM: A Hybrid Approach of Multi-scale Residual CNN and LSTM to Predict Bitcoin Price. arXiv:2105.00707 [cs, q-fin]. [Online]. Available at: http://arxiv.org/abs/2105.00707 [Accessed 17 December 2021].

  26. Wang, J.-J. et al. (2012). Stock index forecasting based on a hybrid model. Omega, 40 (6), pp.758–766. [Online]. Available at: https://doi.org/10.1016/j.omega.2011.07.008.

  27. Cermak, V. (2017). Can Bitcoin Become a Viable Alternative to Fiat Currencies? An Empirical Analysis of Bitcoin’s Volatility Based on a GARCH Model. SSRN Scholarly Paper, Rochester, NY: Social Science Research Network. [Online]. Available at: https://doi.org/10.2139/ssrn.2961405 [Accessed 17 December 2021].

  28. Xie, Y., Ueda, Y. and Sugiyama, M. (2021). A Two-Stage Short-Term Load Forecasting Method Using Long Short-Term Memory and Multilayer Perceptron. Energies, 14 (18), p.5873. [Online]. Available at: https://doi.org/10.3390/en14185873.

  29. Fair, R. C. (1986). Chapter 33 Evaluating the predictive accuracy of models. In: Handbook of Econometrics, vol. 3. Elsevier. pp.1979–1995. [Online]. Available at: https://doi.org/10.1016/S1573-4412(86)03013-1 [Accessed 19 January 2022].

  30. Matta, M., Lunesu, I. and Marchesi, M. (2015). Bitcoin Spread Prediction Using Social And Web Search Media. p.10.

  31. Nochai, R. and Nochai, T. (2006). ARIMA MODEL FOR FORECASTING OIL PALM PRICE. p.7.

  32. Khashei, M., Bijari, M. and Raissi Ardali, G. A. (2012). Hybridization of autoregressive integrated moving average (ARIMA) with probabilistic neural networks (PNNs). Computers & Industrial Engineering, 63 (1), pp.37–45. [Online]. Available at: https://doi.org/10.1016/j.cie.2012.01.017.

  33. Pai, P.-F. and Lin, C.-S. (2005). A hybrid ARIMA and support vector machines model in stock price forecasting. Omega, 33 (6), pp.497–505. [Online]. Available at: https://doi.org/10.1016/j.omega.2004.07.024.

  34. Zhu, Y., Dickinson, D. and Li, J. (2017). Analysis on the influence factors of Bitcoin’s price based on VEC model. Financial Innovation, 3. [Online]. Available at: https://doi.org/10.1186/s40854-017-0054-0.

  35. Olah, C. (2015). Understanding LSTM Networks -- colah’s blog. [Online]. Available at: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ [Accessed 17 February 2022].

  36. Barão, S. M. M. (2008). Linear and Non-linear time series analysis: forecasting financial markets. [Online]. Available at: https://www.semanticscholar.org/paper/Linear-and-Non-linear-time-series-analysis%3A-markets-Bar%C3%A3o/c61536dba552c6a874f43c556e529eeb6e5409e3 [Accessed 17 February 2022].

  37. Buchholz, M. et al. (2012). Bits and Bets Information, Price Volatility, and Demand for Bitcoin. p.48.

  38. Paul, S. (2020). Feature Selection Tutorial in Python Sklearn. [Online]. Available at: https://www.datacamp.com/community/tutorials/feature-selection-python [Accessed 10 March 2022].

  39. Edwards, J. (2022). Bitcoin’s Price History. [Online]. Available at: https://www.investopedia.com/articles/forex/121815/bitcoins-price-history.asp [Accessed 17 March 2022].

  40. Aggarwal, G. et al. (2019). Understanding the Social Factors Affecting the Cryptocurrency Market. arXiv:1901.06245 [cs]. [Online]. Available at: http://arxiv.org/abs/1901.06245 [Accessed 17 March 2022].

  41. Mudassir, M. et al. (2020). Time-series forecasting of Bitcoin prices using high-dimensional features: a machine learning approach. Neural Computing and Applications. [Online]. Available at: https://doi.org/10.1007/s00521-020-05129-6 [Accessed 24 March 2022].

  42. Raudys, A. and Mockus, J. (1999). Comparison of ARMA and multilayer perceptron based methods for economic time series forecasting. Informatica, 10 (2), pp.231–244.

  43. Box, G. E. P. and Jenkins, G. M. (1967). Statistical Models for Prediction and Control. Madison, Wisconsin: Department of Statistics, University of Wisconsin. Technical Reports #72, 77, 79, 94, 95, 99, 103, 104, 116, 121, and 122.

  44. Ciaian, P., Rajcaniova, M. and Kancs, D. A. (2016). The economics of BitCoin price formation. Applied Economics, 48 (19), pp.1799–1815.

Author information

Corresponding author

Correspondence to Aminu Bello Usman.

Appendices

1.1 Appendix A: Data Retrieval Sources

The Bitcoin price index and Blockchain data were obtained on 18 February 2021 from the Blockchain API (https://blockchain.com/). The other exogenous variables, Gold Price and Exchange Rates, were extracted from Investing.com (https://uk.investing.com/commodities/gold and https://uk.investing.com/currencies/usd-cny-historical-data, respectively). Search volume data were retrieved from the Google Trends website (http://www.google.com/trends) on 22 February 2021 and from the Wikipedia article traffic statistics site (https://www.wikishark.com/) on 2 February 2021.

All Bitcoin and Blockchain historical data were validated and cross-checked for accuracy against popular sources such as CoinDesk (https://www.coindesk.com/price/bitcoin/) and Yahoo Finance (https://finance.yahoo.com/).

1.2 Appendix B: Model Data Pre-processing

LSTM: When a network is trained on data with a large range of values, the large input values can slow down learning and can sometimes prevent the network from learning effectively. Therefore, the dataset was scaled using the MinMaxScaler available in the sklearn library, which scales the dataset values to fit between 0 and 1.
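
The scaling step can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's code: the toy array, the 80/20 chronological split, and the choice to fit the scaler on the training portion only are all assumptions, since the chapter does not describe how the scaler was fitted.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: two features on very different scales
# (for example, a price column and a hash-rate column).
data = np.array([
    [100.0, 1.0e6],
    [150.0, 2.0e6],
    [300.0, 4.0e6],
    [250.0, 8.0e6],
    [500.0, 9.0e6],
])

# Assumption: chronological 80/20 split (not stated in the chapter).
split = int(len(data) * 0.8)
train, test = data[:split], data[split:]

scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)  # learns each column's min and max from train
test_scaled = scaler.transform(test)        # reuses them, so values may fall outside [0, 1]

# Map scaled values (for example, model outputs) back to the original units.
restored = scaler.inverse_transform(train_scaled)

print(train_scaled)
print(test_scaled)
```

Fitting on the training rows only keeps information about later prices out of the inputs; the trade-off is that unseen values above the training maximum scale to more than 1.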

ARIMA: Selecting the order of the model is crucial to building a good model. First, the data was explored as discussed in the Results section, and unit root tests were conducted to identify the best order of differencing. Then, in search of the best model, a grid search iteratively fitted models with different orders. Finally, the best model was determined by plotting the residuals and comparing AIC scores.

1.3 Appendix C: ADF Test Scores

 

Scores are listed by differencing order.

| Column | No-diff | diff(0) | diff(1) |
|---|---|---|---|
| Close | 0.868240 | | 0.000000e+00 |
| Open | 0.858296 | | 0.000000e+00 |
| High | 0.873539 | | 0.000000e+00 |
| Low | 0.856106 | | 0.000000e+00 |
| Estimated-transaction-volume-usd | 0.361389 | | 3.769493e-21 |
| n-transactions | 0.077388 | | 1.535323e-19 |
| Hash-rate | 0.899546 | | 4.206086e-26 |
| Difficulty | 0.948766 | | 1.827273e-21 |
| Cost-per-transaction | 0.577504 | | 2.476612e-14 |
| Gold price | 0.885700 | | 6.186569e-21 |
| Output-volume | 0.023501 | 0.023501 | |
| Trade-volume | 0.080964 | | 6.361866e-22 |
| USD-CNY Price | 0.693700 | | 9.939304e-21 |
| SVI | 0.006120 | 0.006120 | |
| Wikiviews | 0.016293 | 0.016293 | |

 

1.4 Appendix D: Findings

The boxplot shows the closing prices for each month across all years. The price gradually grew during the first 4 months, declined over the next 5 months, and finally began to rise again. There are also many outliers following May, which could be due to the high gains for BTC in May 2017 and May 2019, which amounted to over 50%. This could be what causes the model to produce predictions with a higher MAE: the small number of observations was not comprehensive enough for the model to identify the outliers (Fig. 17).

Fig. 17 BTC closing price throughout the decade. A bar plot of Bitcoin closing price between January and December with error bars. The bar is maximum in March with a value of approximately 5000 to 42000.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Jegathees, K.J., Usman, A.B., ODea, M. (2023). Assessing the Predictability of Bitcoin Using AI and Statistical Models. In: Maleh, Y., Alazab, M., Romdhani, I. (eds) Blockchain for Cybersecurity in Cyber-Physical Systems. Advances in Information Security, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-031-25506-9_11

  • DOI: https://doi.org/10.1007/978-3-031-25506-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25505-2

  • Online ISBN: 978-3-031-25506-9

  • eBook Packages: Computer Science (R0)
