Designing a forecasting assistant of the Bitcoin price based on deep learning using market sentiment analysis and multiple feature extraction

  • Soft computing in decision making and in modeling in economics
Soft Computing

Abstract

Fluctuations in the price of Bitcoin strongly affect individual profit and loss, international relations, and trade. A model that accounts for the significant factors behind the Bitcoin price and predicts it as accurately as possible is therefore essential. This paper uses market sentiment and multiple feature extraction to present several Bitcoin price prediction models based on convolutional neural networks (CNN) and long short-term memory (LSTM). The deep learning models take several inputs, including Twitter data, news headlines, news content, Google Trends, Bitcoin-based stock, and finance, to make more accurate predictions. The proposed model also applies the Valence Aware Dictionary and Sentiment Reasoner (VADER) to analyze the sentiment of the latest market and cryptocurrency news. Given the variety of inputs and analyses, several feature selection methods, namely mutual information regression, linear regression, correlation-based selection, and a combination of these, are used to predict the Bitcoin price. Finally, the proposed models are compared on mean square error (MSE), root-mean-square error (RMSE), mean absolute error (MAE), median absolute error, and coefficient of determination (R²). The results indicate that the proposed hybrid model, based on sentiment analysis and combined feature selection, achieves an MSE of 0.001 and an R² of 0.98 and estimates the Bitcoin price with smaller errors than the other proposed models. The proposed model can also serve as a personal assistant for more informed Bitcoin trading decisions.
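As an illustration of the five comparison criteria named in the abstract, the following minimal Python sketch computes them for a pair of true and predicted series. It uses the standard textbook definitions; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper, whose actual evaluation code is not shown here.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, median absolute error, and R^2.

    Hypothetical helper for illustration only; standard definitions,
    not the paper's implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    # Residual sum of squares and the error metrics derived from it.
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs_errors) / n
    medae = statistics.median(abs_errors)

    # R^2 = 1 - SS_res / SS_tot; undefined when the true series is constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "medae": medae, "r2": r2}


# Made-up example values (not Bitcoin data).
example = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(example)
```

Lower MSE, RMSE, MAE, and median absolute error indicate a closer fit, while R² closer to 1 indicates that the predictions explain more of the variance in the true series. The absolute-error metrics depend on the scale of the target, so values are only comparable between models evaluated on the same, identically scaled data.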

Data availability

The data are not publicly available due to the privacy of the research participants.

Acknowledgements

The authors thank the anonymous referee for their valuable and insightful comments.

Funding

This study received no funding.

Author information

Corresponding author

Correspondence to Sina Fakharchian.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Research involving human participants, their data or biological material

Not applicable.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fakharchian, S. Designing a forecasting assistant of the Bitcoin price based on deep learning using market sentiment analysis and multiple feature extraction. Soft Comput 27, 18803–18827 (2023). https://doi.org/10.1007/s00500-023-09028-5

  • DOI: https://doi.org/10.1007/s00500-023-09028-5
