Dataset on sentiment-based cryptocurrency-related news and tweets in English and Malay language

  • Original Paper
  • Published in: Language Resources and Evaluation

Abstract

Cryptocurrency trading has grown popular as a potentially profitable investment, drawing worldwide participation in buying and selling cryptocurrency assets. Sentiments that cryptocurrency enthusiasts express about news on social media and other online platforms may affect cryptocurrency market activity. The challenge is therefore to determine the level of positivity or negativity contained in a text (regression) rather than simply assigning the sentiment to categorical classes. Regression offers more detailed information than simple classification and can be robust to noisy data because it considers the entire range of possible target values. In contrast, classification can lead to biased models when the dataset is imbalanced and tends to cause overfitting. Hence, this work focuses on creating sentiment-based cryptocurrency-related corpora in English and Malay, concentrating on Bitcoin and Ethereum. The data were collected from January to December 2021 from publicly available online news and from tweets on Twitter, in English and Malay. The dataset contains a total of 29,694 instances, comprising 5,694 news items and 24,000 tweets. During the annotation process, the annotators were trained until a Krippendorff’s alpha agreement above 60% was achieved, which is considered a suitable benchmark given the complexity of the annotation. The corpora are available on GitHub for cryptocurrency-related experiments using various machine learning or deep learning models to study the effect of English and Malay sentiment on the global market, particularly the Malaysian market, and can be extended to further analysis of the volatile nature of the Bitcoin and Ethereum markets.
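
The agreement threshold mentioned in the abstract can be illustrated with a minimal sketch of Krippendorff's alpha for numeric sentiment scores. The abstract does not say which distance metric, scoring scale, or tool the authors used, so the interval metric (squared difference), the function name krippendorff_alpha_interval, and the sample scores below are all assumptions made for illustration only.

```python
from itertools import permutations


def krippendorff_alpha_interval(units):
    """Krippendorff's alpha with the interval (squared-difference) metric.

    `units` is a list of lists: each inner list holds the scores that the
    annotators gave to one item, with None marking a missing rating.
    The interval metric is an assumption; the source does not state
    which metric was used.
    """
    # Drop missing ratings, then keep only items rated at least twice,
    # since a single rating cannot be paired with another one.
    pairable = [[v for v in unit if v is not None] for unit in units]
    pairable = [u for u in pairable if len(u) >= 2]

    values = [v for u in pairable for v in u]
    n = len(values)
    if n < 2:
        raise ValueError("need at least one item with two or more ratings")

    # Observed disagreement: squared differences within each item,
    # weighted by 1 / (number of ratings for that item - 1).
    observed = sum(
        sum((a - b) ** 2 for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in pairable
    ) / n

    # Expected disagreement: squared differences over all pairs of
    # ratings, regardless of which item they belong to.
    expected = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    if expected == 0:
        raise ValueError("all ratings are identical; alpha is undefined")

    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical sentiment scores from three annotators for five texts.
    sample_scores = [
        [0.8, 0.7, 0.9],
        [-0.5, -0.4, None],
        [0.0, 0.1, -0.1],
        [0.6, 0.2, 0.5],
        [-0.9, -0.8, -0.7],
    ]
    alpha = krippendorff_alpha_interval(sample_scores)
    # 0.60 mirrors the "above 60%" threshold reported in the abstract.
    print(f"alpha = {alpha:.3f}, meets threshold: {alpha > 0.60}")
```

In a training loop of the kind the abstract describes, a check like this would be rerun after each annotation round until the value exceeds the chosen threshold. For large corpora, a maintained library implementation would be preferable to this quadratic-time sketch.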

Data availability

Available upon request.

Acknowledgements

The authors would like to thank the annotators who participated in the creation of the Malay news and tweets corpora from Universiti Teknologi MARA (UiTM) and Universiti Sains Malaysia (USM).

Funding

This research received no specific grant from any funding agency.

Author information

Contributions

Mohamad Zamani: Conceptualization, Methodology, Data Curation, Investigation, Writing – Original Draft. Kamaruddin: Writing – Review & Editing, Investigation. Yusof: Writing – Editing.

Corresponding author

Correspondence to Nur Azmina Mohamad Zamani.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mohamad Zamani, N.A., Kamaruddin, N. & Yusof, A.M.B. Dataset on sentiment-based cryptocurrency-related news and tweets in English and Malay language. Lang Resources & Evaluation (2024). https://doi.org/10.1007/s10579-024-09733-z
