Abstract
Fluctuations in the price of the Bitcoin digital currency strongly affect individual profit and loss, international relations, and trade. A model that accounts for the main influencing factors and predicts the Bitcoin price as accurately as possible is therefore needed. This paper uses market sentiment and multiple feature extraction to present several Bitcoin price prediction models based on a convolutional neural network (CNN) and long short-term memory (LSTM). The proposed deep learning models take several inputs to predict the Bitcoin price: Twitter data, news headlines, news content, Google Trends, Bitcoin-based stock, and finance data. In addition, the models apply the Valence Aware Dictionary and sEntiment Reasoner (VADER) to analyze the sentiment of the latest market and cryptocurrency news. Given these varied inputs and analyses, several feature selection methods are applied, namely mutual information regression, linear regression, correlation-based selection, and a combination of these methods. Finally, the proposed models are compared on several performance criteria: mean square error (MSE), root-mean-square error (RMSE), mean absolute error (MAE), median absolute error (MedAE), and coefficient of determination (R²). The results indicate that the proposed hybrid model, based on sentiment analysis and combined feature selection, gives the best Bitcoin price estimates with the smallest errors, reaching an MSE of 0.001 and an R² of 0.98. The proposed model can also serve as a personal assistant for more informed Bitcoin trading decisions.
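The five performance criteria named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of MSE, RMSE, MAE, MedAE, and R²; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper, whose own implementation is not shown here.

```python
import math
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MedAE and R2 for two equal-length sequences.

    Hypothetical helper for illustration; it follows the standard
    definitions, not code from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    mse = mean(e * e for e in errors)

    # R2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean
    # of the observed values.
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")

    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mean(abs_errors),
        "medae": median(abs_errors),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Made-up normalized prices, used only to exercise the function.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
for name, value in regression_metrics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

Lower MSE, RMSE, MAE, and MedAE indicate smaller prediction errors, while R² closer to 1 indicates that the predictions explain more of the variance in the observed prices.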
Data availability
The data are not publicly available due to the privacy of the research participants.
Acknowledgements
The authors thank the anonymous referee for their valuable and insightful comments.
Funding
This study has no funding.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Research involving human participants, their data or biological material
Not included.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Fakharchian, S. Designing a forecasting assistant of the Bitcoin price based on deep learning using market sentiment analysis and multiple feature extraction. Soft Comput 27, 18803–18827 (2023). https://doi.org/10.1007/s00500-023-09028-5
DOI: https://doi.org/10.1007/s00500-023-09028-5