Objective

Research in behavioral economics has shown that analyzing market behavior can help predict stock price trends [1]. State-of-the-art studies [2,3,4] have also shown that sentiment analysis of comments on social networks such as X (formerly Twitter), Telegram, Reddit, and Facebook can help predict the price trends of cryptocurrencies. In this article, comments from more than ten popular cryptocurrency-related Telegram channels, posted between December 2023 and March 2024, were extracted using the Telegram API. Unlike previous datasets [5,6,7], which were collected only through hashtags of cryptocurrency names, this dataset contains the analyses of cryptocurrency experts, which strongly influence investors’ decisions to buy, sell, or hold cryptocurrencies. Unlike most existing datasets, it also covers a wide range of cryptocurrencies. In addition to the main text of each Telegram comment, the dataset includes the number of views, the publication date, the polarity, and the polarity score of the comment. After extraction and preprocessing, the polarity of the comments was determined with the HDRB model [8]. This model uses the pre-trained RoBERTa network as the backbone for transfer learning; the extracted knowledge is then fed into a BiGRU deep neural network combined with an attention layer to determine sentiment polarity. The main goal of this research is to provide a dataset of Telegram comments for cryptocurrency sentiment analysis, which can be used to train neural networks for research in the cryptocurrency field. The dataset helps researchers analyze the opinions expressed in Telegram channels on a wide range of cryptocurrencies. The data package includes an Excel table containing the set of Telegram comments and two Word files. The Word files contain the descriptions of the columns of the main table and the Python code used to extract comments from Telegram channels.

Data description

This data package has three files. An Excel file contains the comments of more than ten popular Telegram channels about cryptocurrencies. These comments cover a wide range of cryptocurrencies and were posted between December 2023 and March 2024. They were collected through the Telegram API, and the code for extracting them is available in one of the Word files of the package. After extraction, the comments were preprocessed by normalization, stop-word removal, and lemmatization. The preprocessed data were then fed into the HDRB model, which is described in detail, along with its implementation, by Kia et al. [8]. HDRB is a hybrid model based on deep transfer learning that uses RoBERTa as the backbone and feature extractor, followed by a BiGRU deep neural network and an attention layer, to obtain sentiment polarity and text aspects. The dataset and the Python code for extracting and preprocessing Telegram comments are listed in Table 1.
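The three preprocessing steps named above can be illustrated with a minimal sketch. It is not the code distributed in the data package: the stop-word list, the lemma table, and the function name `preprocess` are small hypothetical stand-ins chosen so that the example runs with the Python standard library alone. A real pipeline would normally rely on a full stop-word list and a proper lemmatizer.

```python
import re

# Hypothetical, deliberately tiny resources used only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "it"}
LEMMAS = {"prices": "price", "buying": "buy", "selling": "sell", "rising": "rise", "coins": "coin"}

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def preprocess(comment: str) -> str:
    """Normalize a comment, drop stop words, and map tokens to lemmas."""
    # Normalization: lowercase the text and strip URLs.
    text = URL_RE.sub(" ", comment.lower())
    # Tokenization: keep alphanumeric runs, which also drops punctuation.
    tokens = TOKEN_RE.findall(text)
    # Stop-word removal.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Lemmatization via the illustrative lookup table (unknown words pass through).
    tokens = [LEMMAS.get(t, t) for t in tokens]
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("The prices of BTC are rising! https://t.me/example"))
```

The steps are applied in the order in which the text lists them. Lemmatization comes last so that the lookup sees lowercase tokens.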

Table 1 Overview of data files/data sets

Dataset 1 contains six columns: (1) text, (2) date, (3) views, (4) scores, (5) compound, and (6) sentiment_type. Here, “text” is the preprocessed Telegram comment; “date” is the date and time of publication of the comment; “views” is the number of times the comment was viewed; “scores” gives the percentages of positive, negative, and neutral polarity, obtained with the HDRB model [8]; “compound” is the sum of all polarities, normalized between −1 (most extreme negative) and +1 (most extreme positive); and “sentiment_type” is the polarity class of the comment (positive, negative, or neutral). Researchers can easily change the number of polarity classes by using the compound values, for example to strongly positive, positive, neutral, negative, and strongly negative.
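As an illustration of how the number of polarity classes could be changed, the sketch below maps a compound value to one of five classes. The cut-off values 0.2 and 0.6 and the function name `five_class_label` are assumptions made for this example. They are not thresholds defined by the dataset or by the HDRB model [8], and researchers should choose cut-offs suited to their own task.

```python
def five_class_label(compound: float, weak: float = 0.2, strong: float = 0.6) -> str:
    """Map a compound score in [-1, 1] to one of five polarity classes.

    The thresholds `weak` and `strong` are illustrative assumptions.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound must lie between -1 and +1")
    if compound >= strong:
        return "strongly positive"
    if compound >= weak:
        return "positive"
    if compound > -weak:
        return "neutral"
    if compound > -strong:
        return "negative"
    return "strongly negative"


if __name__ == "__main__":
    for value in (0.8, 0.3, 0.0, -0.3, -0.9):
        print(value, five_class_label(value))
```

Because the function reads only the “compound” column, it can be applied row by row to the Excel table without re-running the sentiment model.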

Limitations

There are no limitations in the datasets, and the Telegram channels from which the comments were extracted are public.