RETRACTED ARTICLE: A combination of TEXTCNN model and Bayesian classifier for microblog sentiment analysis

Wang, Zhanfeng; Yao, Lisha; Shao, Xiaoyu; Wang, Honghai

doi:10.1007/s10878-023-01038-1

Published: 11 May 2023

Volume 45, article number 109, (2023)

Journal of Combinatorial Optimization

Zhanfeng Wang¹,
Lisha Yao ORCID: orcid.org/0000-0002-6006-3092²,
Xiaoyu Shao² &
…
Honghai Wang¹


This article was retracted on 25 March 2024

This article has been updated

Abstract

Research on the emotional information found in microblog comments is attracting growing attention. TEXTCNN is developing rapidly in the short-text domain. However, because the trained TEXTCNN model is neither very extensible nor very interpretable, it is difficult to quantify and evaluate the relative importance of its features. At the same time, conventional word embedding cannot represent the multiple meanings of a polysemous word. To address these shortcomings, this research proposes a microblog sentiment analysis method based on TEXTCNN and Bayes. First, word embedding vectors are obtained with the word2vec tool, and from these vectors the ELMo model generates ELMo word vectors that integrate contextual features and different semantic features. Second, the local features of the ELMo word vectors are extracted from multiple angles by the convolution layer and pooling layer of the TEXTCNN model. Finally, the emotion data classification task is completed by combining a Bayes classifier. On the Stanford Sentiment Treebank (SST) data set, the proposed model is compared with the TEXTCNN, LSTM, and LSTM–TEXTCNN models. The experimental results show that its Accuracy, Precision, Recall, and F1-score are 0.9813, 0.9821, 0.9804 and 0.9812, respectively, which are higher than those of the comparison models, and that the model can be used effectively for accurate emotion analysis and identification of events in microblog sentiment analysis.

1 Introduction

With the increasing use of the Internet and mobile devices, Internet users have become accustomed to voicing opinions, criticism and suggestions on public network channels, for example by posting positive and negative comments on social media about corporate brands, products and policy issues. Analyzing such positive and negative evaluations is the main application scenario of sentiment analysis (Valdivia 2017).

Users share real-time information and comment on it through computers, mobile phones, tablets and other devices. Netizens use Weibo comments as a carrier for spreading and expressing attitudes, opinions and emotions (Liu and Zhang 2013; Cambria et al. 2016). With the explosive growth in Weibo users, commercial microblogs that evade all kinds of responsibility have appeared in a disorderly fashion, and false information, the buying and selling of followers, and paid posters (the "network water army") have become prominent. The sheer number of microblog celebrities makes it difficult to distinguish true from false, which poses a serious threat to social stability and security (Yang et al. 2020). Recently, network security has attracted the attention of network security supervision departments. How to monitor microblogs is a major issue facing the national government and the network supervision departments. Monitoring consumes considerable manpower and material resources, and it remains difficult to grasp the sensitive content and communication trends of Weibo public opinion in a timely, accurate and comprehensive manner (Wang et al. 2013).

Sentiment analysis of Weibo comments therefore provides important support for the early warning of, and emergency response to, public events (Ghasemi et al. 2022). Meanwhile, the volume of subjective emotional messages is rising steadily, and the analysis of emotional content has gradually shifted from the simple recognition of subjective emotional words to the increasingly complex and diverse analysis of emotional texts (Wang et al. 2022).

However, traditional sentiment analysis methods lack deeper semantic and logical support when applied to microblog public opinion events. This results in low accuracy in classifying and predicting emotional events, so microblog public opinion emergencies cannot be detected as soon as they arise. In light of these issues, this study proposes an emotion feature analysis model that combines TEXTCNN and a Bayes classifier to mine the semantic associations of emotion feature words at a deeper level.

The first section of this paper discusses the need to analyze microblog public opinion events. The second section reviews related work by other researchers. The third section details the emotion analysis model that combines the TEXTCNN model and a Bayes classifier. The fourth section compares the TEXTCNN, LSTM and LSTM-TEXTCNN models with the model proposed in this paper through experimental analysis. The fifth section concludes the paper.

2 Related research

Research abroad on the analysis and systematic classification of text emotion polarity started relatively early. For example, Turney (2002) was the first to formally put forward a systematic classification method for the emotional polarity of text, in which the polarity present in a text can be classified with an unsupervised machine learning method. Ye et al. (2009) integrated the n-gram model with Bayesian, SVM and other algorithms on the basis of previous studies and proposed a hybrid model. After a series of experiments and comparisons, they showed that the hybrid model had higher accuracy than the single-model methods. Yang et al. (2017) significantly improved the scoring calculation of methods based on emotional dictionaries. They constructed and refined an emotion model that scores a text by comparing the semantic emotional tendency of its words after processing with the emotion dictionary, providing an analysis framework for text emotion orientation with value for both academic research and applied research in industry and commerce. Mikolov et al. (2014) proposed a log-bilinear model and applied it to large volumes of natural language and to fast task-processing workflows, where its accuracy reached a generally accepted level for the first time. They went on to improve the model and released Word2Vec, a widely used open-source word vector training tool. Chen et al. (2022) classified the sentiment of online user comments with a naive Bayes classifier, calculated the uncertainty of social media information, and combined users' emotional tendencies with the influence of user nodes to characterize public opinion on social media during COVID-19.

Deep learning networks (Liu et al. 2021; Qu et al. 2021; Wu et al. 2020, 2021; Xu et al. 2021a, b) are hierarchical networks (Liu et al. 2022). They have brought many advances to the analysis of text emotion features, but still face many challenges (Phan et al. 2022; Zhang et al. 2022). Deep learning for language was first developed within natural language processing (NLP) research and has produced a number of academic achievements. In a study by Jeff et al. (2019), a deep learning model was used to reconstruct the language model, and the concept of the word vector was put forward for the first time. Because computer hardware and computing capacity for training were limited at the time, the experimental results were not particularly prominent and did not attract the level of attention that such research receives today. Ronan et al. (2008) studied pre-trained word embedding as a simple but effective and practical tool for neural network models. Kim (2014) first proposed applying convolution operations in a convolutional neural network to extract the important emotional features hidden in the word vectors that represent a text. Kalchbrenner et al. (2014) further proposed a method based on dynamic pooling that captures important features directly from the text vector space while preserving the relative positions of the captured features within a deep convolutional neural network model. Zhang et al. (2021) proposed a TextCNN-based Chinese short text classification model that uses reverse translation to expand the data and compensate for a lack of training data.

LeCun et al. (2019) combined a model built on Convolutional Neural Networks (CNN) with gradient-based learning to analyze and recognize document content. Liu et al. (2020) proposed a multi-modal emotion recognition model based on an LSTM network to address the problem that the accuracy of a single-modal model depends on the emotion type, and thereby to better describe the degree of emotion. Li et al. (2020) proposed a joint LSTM-TextCNN model. To obtain more representative information for text classification and increase classification accuracy, they tested their combined model against the single LSTM and TEXTCNN models. According to their experimental findings, the classification accuracy of the combined LSTM-TEXTCNN model is 83.3%, which is 4.2% higher than that of the TEXTCNN model and 9.1% higher than that of the LSTM model, so the combined model is superior to the single TEXTCNN and LSTM models in capturing text features. In terms of text feature extraction and expression, deep learning models thus outperform the algorithms used in conventional sentence classification approaches.

In addition to a suitable model for emotional analysis, an effective word vector representation is also very important. At present, word embedding is commonly used to generate word representations, and the representative methods are word2vec (Hong et al. 2022) and GloVe (Li et al. 2022). These methods need no prior knowledge and learn semantic features from a text corpus alone, which has made them popular with many researchers. However, they assign each word a single vector and so cannot express the different meanings of polysemous words, which distorts word semantics and leads to inaccurate subsequent analysis. To handle polysemy and express word features accurately, this paper uses the ELMo model, built on a bidirectional LSTM, to learn from a pre-training corpus and obtain embedding vectors that combine contextual features with different word senses. The resulting word vector contains not only the semantic features of the word itself but also its contextual features, which makes up for the shortcomings of traditional word embedding methods.

To sum up, the TEXTCNN model has a simple structure, few parameters and fast training speed, and is widely used in text classification (Aljohani et al. 2023; Angeli et al. 2022; Alizadeh et al. 2021). TEXTCNN-based models can also use reverse translation to expand the data and make up for insufficient training data. However, because the TEXTCNN model is neither very extensible nor very interpretable, it is difficult to optimize and adjust it directly for the emotional characteristics of specific objects on the basis of the analysis results obtained during tuning. Emotional feature analysis of microblog comments also often lacks the support of deeper semantic logic. As a result, the accuracy of classifying and predicting emotional events is usually low, and public opinion emergencies on Weibo cannot be detected as soon as they arise. The Bayes classifier assumes that the attributes are independent of each other given the target value, so that no attribute variable carries a disproportionately large or small weight in the decision result (Yang et al. 2021; Tan et al. 2022). This assumption reduces the influence of the relative weight of attribute variables on the decision result. The naive Bayes algorithm is among the classifiers with good learning efficiency and classification performance; it is intuitive, logically simple and easy to interpret. In light of these issues, this work proposes an emotion feature analysis model that combines TEXTCNN and a Bayes classifier to mine the semantic associations of emotion feature words at a deeper level. Experimental comparison on the Stanford Sentiment Treebank (SST) data set demonstrates the validity of the proposed model.

3 Sentiment analysis model combining TEXTCNN model and Bayesian classifier

3.1 General guidelines

A TEXTCNN model used on its own is neither very extensible nor very interpretable, so it is difficult to optimize and adjust it directly for the emotional characteristics of specific objects on the basis of the analysis results obtained from training. Emotional feature analysis of Weibo public opinion often lacks deeper semantic and logical support, which leads to low accuracy in classifying and predicting emotional events and makes it impossible to detect Weibo public opinion emergencies as soon as they arise. To solve these problems, this paper proposes an emotional feature analysis model based on TEXTCNN and a Bayesian classifier to mine the associations between emotional feature words at a deeper level.

To overcome the limitation that traditional word embedding methods express only a single meaning per word, this paper generates word vectors with the ELMo model. Within the TEXTCNN framework, this paper applies a naive Bayes classifier to the output features, with the aim of preventing the classifier from converging on local keyword features and steering it toward a globally optimal solution. This is intended to further improve how well the model's extracted features capture global keyword information for classification. The algorithm has two stages: training and testing. In the training stage, word vectors are obtained with the word2vec tool, and ELMo word vectors are then generated by the ELMo model. The ELMo word vectors are input into the TEXTCNN model for training, and the trained network is then used to extract the features of all texts. After the extracted text features are processed, they are used to train the naive Bayes classifier. In the testing stage, the test set is preprocessed, its text features are extracted through TEXTCNN, and these features are passed to the naive Bayes classifier to obtain the classification results, which supports the sentiment classification and prediction of microblog public opinion.

Sentiment analysis can be realized by combining TEXTCNN and a Bayesian classifier. The steps are as follows:

(1) Train word2vec on the pre-training corpus to obtain word embedding vectors.
(2) Input the word vectors obtained by word2vec into the ELMo model, and train the model on the pre-training corpus to generate the ELMo word vectors.
(3) Input the ELMo word vectors into the TEXTCNN model. Convolution kernels of various sizes process the word vectors to obtain multidimensional feature maps, from which representative local features are extracted.
(4) Apply 1-max pooling to extract the maximum value from each feature map. The resulting emotion feature vector is processed by the naive Bayes classifier to further highlight the weight of key words, and the naive Bayes classifier classifies the emotion of Weibo public opinion.

3.2 ELMo model

The ELMo model was proposed in 2018 to build context-dependent word vectors, represent polysemous words accurately, and overcome the limitation that traditional word embedding expresses only a single meaning per word. The model trains a language model with a bidirectional LSTM network on the pre-training corpus and obtains semantic vectors that integrate context-dependent features.

The ELMo model uses LSTM to build a language model, which calculates the probability of a given sentence. Suppose the sentence $S$ contains $n$ words, $S=\{{t}_{1},{t}_{2},\cdots ,{t}_{n}\}$. If the probability of the $k$-th word ${t}_{k}$ depends only on the $k-1$ words that precede it, the model is a forward language model. If the probability of ${t}_{k}$ depends only on the $n-k$ words that follow it, the model is a backward language model.
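Written out, the two models factorize the sentence probability in the standard way shown below. This is a sketch in the notation of this section rather than a formula quoted from the paper; the subscripts fwd and bwd are labels introduced here for illustration. The forward model conditions each word on the $k-1$ words before it, and the backward model conditions it on the $n-k$ words after it.

$$P_{\mathrm{fwd}}\left(S\right)=\prod_{k=1}^{n}P\left({t}_{k}|{t}_{1},{t}_{2},\cdots ,{t}_{k-1}\right),\qquad P_{\mathrm{bwd}}\left(S\right)=\prod_{k=1}^{n}P\left({t}_{k}|{t}_{k+1},{t}_{k+2},\cdots ,{t}_{n}\right).$$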

The ELMo model first uses LSTM to build the forward language model. The word embedding vectors ${x}_{1},{x}_{2},\cdots ,{x}_{k}$ corresponding to ${t}_{1},{t}_{2},\cdots ,{t}_{k}$ are obtained with the word2vec tool and fed in turn into an $L$-layer LSTM network, which yields the hidden layer states ${h}_{l1},{h}_{l2},\cdots ,{h}_{lk}$ of each layer, where $l\in \{1,2,\cdots ,L\}$ is the layer index. Similarly, the ELMo model uses another LSTM network to build a backward language model, which reads the sentence in the opposite direction. Feeding the embedding vectors ${x}_{1},{x}_{2},\cdots ,{x}_{k}$ corresponding to ${t}_{1},{t}_{2},\cdots ,{t}_{k}$ into this LSTM yields the hidden layer states ${h}_{l1}^{\prime},{h}_{l2}^{\prime},\cdots ,{h}_{lk}^{\prime}$ of the corresponding layers. Finally, a bidirectional language model is constructed by concatenating the last-layer hidden states of the two directions to obtain ${H}_{L1},{H}_{L2},\cdots ,{H}_{Ln}$, where the $k$-th word corresponds to the hidden layer state ${H}_{Lk}=\{{h}_{Lk},{h}_{Lk}^{\prime}\}$. The model is shown in Fig. 1.

The ELMo word vector is obtained as a weighted sum of the hidden layer state vectors of each layer of the bidirectional LSTM. Let ${H}_{jk}$ be the concatenated bidirectional hidden state of layer $j$ for the $k$-th word and ${x}_{k}$ the input word vector; the corresponding ELMo word vector is then given by Eq. (1).

$${ELMo}_{k}=\gamma \left({s}_{0}{x}_{k}+\sum_{j=1}^{L}{s}_{j}{H}_{jk}\right),$$

(1)

where $\gamma $ is the scaling factor and ${s}_{j}$ is the normalized coefficient that weights the hidden layer state of layer $j$ (${s}_{0}$ weights the input word vector). These parameters can be learned while optimizing the downstream task. Alternatively, the last hidden layer state of the LSTM can be used directly as the ELMo word vector, as shown in Eq. (2).

$${ELMo}_{k}={H}_{Lk}.$$

(2)

In this paper, the latter is adopted, that is, the last hidden layer state of LSTM is directly used as ELMo word vector. ELMo word vector contains not only the semantics of the word itself, but also the corresponding contextual semantics of the word, which contains more information.
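The two ways of forming the ELMo word vector can be compared in a few lines of NumPy. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the layer coefficients are normalized with a softmax (the paper only says they are normalized), and the input vector is assumed to have already been brought to the same dimension as the concatenated hidden states.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax, used here to normalize the layer scores."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def elmo_weighted(x_k, H_k, raw_s, gamma=1.0):
    """Eq. (1): gamma * (s_0 * x_k + sum_j s_j * H_jk).

    x_k   : (d,) input word vector (assumed to match the hidden-state size)
    H_k   : (L, d) one concatenated bidirectional hidden state per layer
    raw_s : (L + 1,) unnormalized scores; softmax normalization is an assumption
    """
    s = softmax(raw_s)
    layers = np.vstack([x_k[None, :], H_k])       # row 0 is x_k, rows 1..L are layers
    return gamma * (s[:, None] * layers).sum(axis=0)

def elmo_last_layer(H_k):
    """Eq. (2): use the last layer's hidden state directly (the choice made in the paper)."""
    return H_k[-1]

# Toy example with L = 2 layers and d = 4 (hypothetical sizes).
rng = np.random.default_rng(0)
x_k = rng.normal(size=4)
H_k = rng.normal(size=(2, 4))
weighted = elmo_weighted(x_k, H_k, raw_s=[0.0, 0.0, 0.0])  # equal weights: a plain average
last = elmo_last_layer(H_k)
```

With equal scores the weighted form reduces to the average of the input vector and the layer states, whereas Eq. (2) discards everything except the top layer, which is simpler but gives up the learned mixing weights.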

3.3 TEXTCNN model

The Text Convolutional Neural Network (TEXTCNN) model is a variant of the CNN model. By choosing filter kernels of different sizes, it quickly extracts local features at several scales from the text, yielding features that carry more diverse information and represent the text more strongly. The model has been shown to be directly applicable to text classification and similar tasks.

Sentiment analysis is an important application of natural language processing. Its main purpose is to determine the emotional polarity expressed in large volumes of subjective text, that is, to judge whether the emotion is positive or negative. In recent years, sentiment analysis has been applied to many types of text, including review data, news, Twitter posts and posts on the Weibo platform. The text must first be preprocessed, for example by word segmentation and stop-word removal. Word segmentation divides each sentence into its component words, and stop-word removal deletes connecting words and symbols that carry no meaning. For the vectorized representation of words, a distributed vector representation assigns high similarity to words with similar meanings. Compared with the one-hot representation, word embedding also copes better with the semantic redundancy of short texts and reduces the amount of calculation. In this paper, following the ELMo model, we train a bidirectional LSTM, concatenate the hidden state vectors of its last layer to generate the ELMo word vectors, and then input the ELMo word vectors into the TEXTCNN model to learn and extract the local features located at different positions in the text.

The basic structure of the TEXTCNN model is intuitive and simple, consisting of an embedding layer, a convolution layer, a pooling layer, a fully connected layer and a Softmax layer. Figure 2 shows the model architecture.

The first layer of the model is the embedding layer. Its main function is to transform words into word vectors. The embedding layer is built from the previously trained word vector matrix. In the TEXTCNN model, the data matrix can be passed to the 2D convolution layer only after this embedding transformation is complete.

The second layer is the convolution layer. Although its input is a two-dimensional matrix, the convolution is effectively one-dimensional: each kernel spans the full word vector width and slides only along the time (word position) dimension, so each output value captures the relationship between adjacent words. This paper uses convolution kernels of three sizes, 3, 4 and 5, to extract multidimensional feature maps.

The third layer is the pooling layer. The feature maps produced by the convolution layer are passed to a max-over-time pooling layer, that is, one-dimensional global max pooling. This layer keeps the most salient feature in each filter's output and thereby reduces the influence of less relevant features. The pooled outputs of all filters are combined into a single vector, which serves directly as the input to the subsequent layer.

The fourth layer is the fusion layer. It concatenates the pooled features from the three convolution kernel sizes to obtain a more representative text feature vector. The concatenated feature vector is input into the naive Bayes classifier for learning and classification, which produces the final classification result.

Text features are primarily extracted using TEXTCNN models. The model design is as follows:

(1) The heart of the TEXTCNN model is the convolution layer. Convolution kernels of different sizes extract multi-dimensional feature maps. The calculation is shown in Eq. (3).
$$c=f\left({w}_{i}\cdot M+b\right),$$
(3)
where $f$ is the activation function of the convolution layer, usually the ReLU function, ${w}_{i}$ is a convolution kernel, and $b$ is the bias term. $M\in {R}^{L\times d}$ is the word vector matrix, where $L$ is the sentence length and $d$ is the word vector dimension. $c$ is the feature map extracted by the convolution layer and serves as the input to the pooling layer.
(2) The pooling layer makes the network focus on whether a feature is present rather than on its precise location, and it reduces dimensionality by shrinking the number of network parameters and feature values. This paper uses max pooling: the feature map $c$ extracted by the convolution layer is max-pooled to obtain the retained feature $k$. The calculation is shown in Eq. (4).
$$k=\mathrm{max}\left({c}_{1},{c}_{2},\cdots ,{c}_{n}\right).$$
(4)
(3) The fusion layer concatenates the extracted features into the final, most representative feature vector.
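The three design steps above (convolution, max pooling, fusion) can be traced on a toy input. The sketch below is illustrative only and assumes plain NumPy with random, untrained kernels; the function names are hypothetical, and it uses one kernel per size for brevity, whereas the paper reports 256 channels.

```python
import numpy as np

def conv_feature_map(M, w, b=0.0):
    """Eq. (3): slide kernel w (h x d) over the rows of M (L x d), then apply ReLU.

    The kernel spans the full embedding width d, so it moves only along the
    word-position axis and yields L - h + 1 values.
    """
    h = w.shape[0]
    n_windows = M.shape[0] - h + 1
    c = np.array([np.sum(w * M[i:i + h]) + b for i in range(n_windows)])
    return np.maximum(c, 0.0)  # ReLU activation

def textcnn_features(M, kernels, biases):
    """Eq. (4) plus fusion: 1-max pool each feature map, then concatenate the maxima."""
    return np.array([conv_feature_map(M, w, b).max() for w, b in zip(kernels, biases)])

# Toy sentence: 10 words with 8-dimensional vectors (hypothetical sizes).
rng = np.random.default_rng(0)
M = rng.normal(size=(10, 8))
kernels = [rng.normal(size=(h, 8)) for h in (3, 4, 5)]  # kernel sizes from the paper
biases = [0.0, 0.0, 0.0]
features = textcnn_features(M, kernels, biases)  # one pooled value per kernel
```

Because max pooling keeps a single value per kernel, the length of the fused vector depends only on the number of kernels and not on the sentence length, which is what lets sentences of different lengths share one downstream classifier.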

3.4 Naive Bayes classifier

The Naive Bayes Classifier (NBC) is a classification method based on Bayes' theorem and the assumption that features are conditionally independent. It uses the prior probability distribution to predict the posterior probability that a sample belongs to each category and selects the category with the highest probability as the predicted category. The classification process is as follows: for a given training data set, the joint probability distribution of input and output is first learned under the conditional independence assumption; then, for a given input $x$, the output $y$ with the maximum posterior probability is obtained by Bayes' theorem. Mathematically, assume that the input feature vector $X=\left({x}_{1},{x}_{2},\cdots ,{x}_{n}\right)$ is the sample to be classified and that the output space is the set of class labels $Y=\left\{{c}_{1},{c}_{2},\cdots ,{c}_{m}\right\}$. To classify the sample $X$, it is necessary to calculate $P\left({c}_{1}|X\right),P\left({c}_{2}|X\right),\cdots ,P\left({c}_{m}|X\right)$; the predicted category of $X$ is then given by Eq. (5).

$$P\left({c}_{k}|X\right)=max\left\{P\left({c}_{1}|X\right),P\left({c}_{2}|X\right),\cdots ,P\left({c}_{m}|X\right)\right\},$$

(5)

where ${c}_{k}$ is the category that the naive Bayesian classifier predicts for the sample. The posterior probabilities in Eq. (5) are computed as follows:

(1) Construct a training sample set with known class labels.
(2) Estimate the conditional probability of each feature in each category from the training set, such as $P\left({x}_{1}|{c}_{1}\right),P\left({x}_{2}|{c}_{1}\right),\cdots ,P\left({x}_{n}|{c}_{1}\right)$.
(3) Apply Bayes' theorem to express the posterior probability of each category, as shown in Eq. (6).
$$P\left({c}_{i}|X\right)=\frac{P\left(X|{c}_{i}\right)P\left({c}_{i}\right)}{P\left(X\right)}.$$
(6)
(4) In Eq. (6), the denominator $P(X)$ is the same for all categories, so it is only necessary to maximize the numerator. Assuming that the feature attributes are independent of each other given the category, the numerator factorizes as shown in Eq. (7).
$$P\left(X|{c}_{i}\right)P\left({c}_{i}\right)=P\left({c}_{i}\right)\prod_{j=1}^{n}P\left({x}_{j}|{c}_{i}\right).$$
(7)

The basic principle of the Bayes classifier is to assume that the features are independent of each other and to use Bayes' theorem to build the classifier; in essence it is a conditional probabilistic classifier. The Bayes classifier classifies well and is a mainstream machine learning method. It suits classification problems that depend on many conditions, and text sentiment analysis is such a problem. Compared with other mainstream machine learning models, its simple learning mechanism keeps the computational cost low, and it has advantages when processing low-complexity data. Therefore, this paper combines a Bayesian classifier with TEXTCNN to improve the efficiency and accuracy of sentiment analysis.

After the text data passes through the text convolution network TEXTCNN, the main feature attributes of each sentence are obtained, and the sentence features are classified by the naive Bayes classifier. Given a sentence feature vector, the class-conditional probability is first computed under the independence assumption, as shown in Eq. (8), where $P\left({x}_{j}|c\right)$ is the probability of feature attribute ${x}_{j}$ given the sentence category $c$. Bayes' formula is then used to calculate the posterior probability that a sentence with the observed attributes belongs to each sentence category, as shown in Eq. (9). Finally, the sentence is assigned to the category $\hat{c}$ with the maximum posterior probability, as shown in Eq. (10).

$$P\left({x}_{1},{x}_{2},\cdots ,{x}_{n}|c\right)=\prod_{j=1}^{n}P\left({x}_{j}|c\right).$$

(8)

$$P\left(c|{x}_{1},{x}_{2},\cdots ,{x}_{n}\right)=\frac{P\left(c\right)\prod_{j=1}^{n}P\left({x}_{j}|c\right)}{P\left({x}_{1},{x}_{2},\cdots ,{x}_{n}\right)}.$$

(9)

$$\hat{c}=\underset{c}{\mathrm{argmax}}\ P\left(c\right)\prod_{j=1}^{n}P\left({x}_{j}|c\right).$$

(10)
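Eqs. (8)-(10) translate almost directly into code once a form is chosen for $P\left({x}_{j}|c\right)$. The paper does not state which likelihood it uses; the sketch below assumes a Gaussian likelihood per feature, a common choice for continuous features such as pooled CNN activations, and works with log-probabilities so that the product in Eq. (8) becomes a sum. The class name and toy data are hypothetical.

```python
import numpy as np

class GaussianNaiveBayes:
    """Naive Bayes with per-class, per-feature Gaussian likelihoods (an assumption)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # log P(c): class priors estimated from label frequencies
        self.log_prior_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        # Mean and variance of every feature within every class
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def joint_log_likelihood(self, X):
        """log P(c) + sum_j log P(x_j | c): the log of the numerator of Eq. (9)."""
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.mean_[None, :, :]            # (samples, classes, features)
        log_pdf = -0.5 * (np.log(2.0 * np.pi * self.var_)[None] + diff ** 2 / self.var_[None])
        return self.log_prior_[None, :] + log_pdf.sum(axis=2)    # sum over features, Eq. (8)

    def predict(self, X):
        """Eq. (10): pick the class with the largest posterior numerator."""
        return self.classes_[np.argmax(self.joint_log_likelihood(X), axis=1)]

# Toy 2-feature data: label 1 = positive, label 0 = negative (hypothetical values).
X_train = [[1.0, 0.1], [1.2, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 1.1], [0.2, 0.9]]
y_train = [1, 1, 1, 0, 0, 0]
clf = GaussianNaiveBayes().fit(X_train, y_train)
predictions = clf.predict([[1.1, 0.1], [0.0, 1.0]])
```

The denominator of Eq. (9) is never computed because it is the same for every class and does not change the argmax.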

3.5 Model parameters

This paper aims to extract and analyze high-quality text information from Weibo through data acquisition and preprocessing, including word segmentation, stop-word filtering, feature extraction and word embedding. It designs and applies a sentiment analysis algorithm based on the hybrid model of TEXTCNN and a Bayes classifier. The word vector dimension of the ELMo model in this paper is 128. The bidirectional LSTM has 2 layers. The deployment depth is 30. The ELMo word vector dimension is 256. The ELMo model is pre-trained on English Wikipedia data (https://dumps.wikimedia.org/enwiki/). Convolution kernel sizes of 3, 4, and 5 are used, together with 256 channels and a learning rate of 5e-5. The model is optimized with the Adam optimizer, and a Dropout layer with a rate of 0.5 is added to prevent overfitting. The model's primary parameter settings are displayed in Table 1.
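For reference, the settings stated above can be gathered into a single configuration object. This is only a sketch: the class and field names are hypothetical, the "deployment depth" of 30 is left out because the text does not say what it controls, and the derived feature size assumes 256 channels for each of the three kernel sizes.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters as reported in Sect. 3.5 (field names are hypothetical)."""
    word_vector_dim: int = 128                 # word vector dimension of the ELMo model
    bilstm_layers: int = 2                     # layers in the bidirectional LSTM
    elmo_vector_dim: int = 256                 # dimension of the ELMo word vector
    kernel_sizes: Tuple[int, ...] = (3, 4, 5)  # convolution kernel sizes
    channels: int = 256                        # channels (assumed per kernel size)
    learning_rate: float = 5e-5
    optimizer: str = "adam"
    dropout: float = 0.5

config = ModelConfig()

# Length of the fused feature vector after 1-max pooling, under the
# per-kernel-size assumption stated above.
pooled_feature_dim = len(config.kernel_sizes) * config.channels
```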

Table 1 Model parameters


4 Experimental analysis

4.1 Data set

To assess the accuracy of the model used in this paper, the experiments use the sentiment classification corpus SST (Stanford Sentiment Treebank, https://nlp.Stanford.edu/sentiment/). The SST-2 data set is derived from the Stanford Sentiment Treebank, which extends the movie review (MR) data set and contains about 11,855 pieces of text. SST-2 is a single-sentence classification task consisting of sentences from movie reviews with human annotations of their sentiment. The task is to predict the sentiment of a given sentence, which falls into one of two categories: positive (sample label 1) or negative (sample label 0); only sentence-level labels are used. The data statistics used in this paper are shown in Table 2.

Table 2 Data set statistics


4.2 Evaluation criteria

In this paper, sentiment analysis is carried out on the SST data set. The evaluation uses four standard indicators: Accuracy, Precision, Recall and F1-score, calculated as in Eqs. (11)–(14).

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN},$$

(11)

$$Precision=\frac{TP}{TP+FP},$$

(12)

$$Recall=\frac{TP}{TP+FN},$$

(13)

$${F}_{1}\text{-}score=\frac{2\times Precision\times Recall}{Precision+Recall},$$

(14)

where TP is the number of positive samples correctly predicted as positive; TN is the number of negative samples correctly predicted as negative; FP is the number of negative samples incorrectly predicted as positive; and FN is the number of positive samples incorrectly predicted as negative.

Accuracy is the proportion of all samples that are classified correctly. Precision concerns the prediction results: it is the proportion of samples predicted to be positive that are actually positive. Recall concerns the actual samples: it is the proportion of actually positive samples that are correctly predicted as positive. The F1-score is the harmonic mean of precision and recall and is commonly used as a single overall measure of the quality of a classification model. This study adopts these four indicators to evaluate the sentiment analysis of user comments and public opinion information comprehensively.
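As a worked illustration of Eqs. (11)-(14), the following Python sketch computes the four indicators from lists of true and predicted labels. The function name `classification_metrics` and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example, not something the source specifies.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: 4 positive and 6 negative sentences (made-up labels).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case TP = 2, FN = 2, FP = 1 and TN = 5, so accuracy is 7/10, precision is 2/3, recall is 1/2 and the F1-score is 4/7.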

4.3 Comparative experimental analysis

In order to verify the validity of the model presented in this paper, experimental analysis was conducted on the data set SST.

Example 1

The model in this paper is trained and validated on the SST data set.

Figure 3 shows how the model's accuracy and loss values varied on the training set and validation set. On the training set, the accuracy exceeds 90% early in training, and by about the fifth round it approaches 98%. In later rounds, the accuracy fluctuates within a narrow range around this value.

Example 2

Model parameter analysis. The setting of model parameters has an important influence on the results. This experiment investigates the convolution kernel size and learning rate parameters to further enhance the model's performance.

The size of the convolution kernel determines the receptive field of the features extracted by the model. A larger convolution kernel captures more features but has more parameters and therefore a higher computational cost, so an appropriate size should be selected. Table 3 shows the results for different convolution kernel sizes. When the convolution kernel sizes are set to {3,4,5}, the accuracy, precision, recall and F1-score of the model reach 98.13%, 98.21%, 98.04% and 98.12%, respectively, which is the best result. When the sizes are set to {2,3,4}, all four metrics are at their lowest. When the sizes are set to {4,5,6} and {3,4,5,6}, the results are very close. Considering the computational cost, this paper chooses the convolution kernel sizes {3,4,5}.
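How the kernel size sets the window over the word sequence can be sketched in plain Python. The example below is a simplified illustration under stated assumptions, not the model's code: a kernel of height h slides over an n-word sentence matrix without padding, giving n - h + 1 window responses, and max-over-time pooling keeps one value per kernel. The function `feature_map`, the toy vectors and the all-ones kernels are made up, and bias and activation are omitted.

```python
def feature_map(embeddings, kernel):
    """Slide an h x d kernel over an n x d sentence matrix (no padding, stride 1)."""
    h = len(kernel)
    outputs = []
    for start in range(len(embeddings) - h + 1):
        window = embeddings[start:start + h]
        outputs.append(sum(k * e
                           for k_row, e_row in zip(kernel, window)
                           for k, e in zip(k_row, e_row)))
    return outputs


# Toy sentence: 7 tokens with 2-dimensional vectors (made-up values).
sentence = [[float(i), 1.0] for i in range(7)]

map_lengths = {}
pooled = []
for h in (3, 4, 5):
    kernel = [[1.0, 1.0] for _ in range(h)]  # all-ones kernel, for illustration only
    fmap = feature_map(sentence, kernel)
    map_lengths[h] = len(fmap)   # n - h + 1 windows
    pooled.append(max(fmap))     # max-over-time pooling keeps one value per kernel
```

A larger h sees more words per window but yields fewer windows and needs h × d weights per channel, which is the trade-off between feature coverage and computation described above.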

Table 3 Effect of convolution kernel size on the results


The learning rate is a crucial parameter that affects how quickly the model converges. If the learning rate is set too low, convergence slows down and the training time increases. If it is set too high, the model may fall into a local optimum or even fail to converge. An appropriate learning rate must therefore be chosen.

To find a suitable learning rate, the experiment plotted the loss value and the accuracy against the learning rate, as shown in Fig. 4(a) and (b), respectively. As the figure shows, when the learning rate is smaller than 1e-6, the loss decreases and the accuracy increases as the learning rate increases. When the learning rate is greater than 1e-5, the loss and accuracy no longer change as the learning rate increases. Therefore, this paper chooses 5e-5 as the learning rate, which speeds up the convergence of the model and improves the training efficiency.
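The general effect of the learning rate on convergence can be seen in a toy example. The sketch below runs plain gradient descent on f(w) = w², which is not the paper's model or optimizer (the paper uses Adam); the function name, step count and the three learning rates are arbitrary choices for illustration and are unrelated to the values in Fig. 4.

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimise f(w) = w**2 (gradient 2*w) and return the final |w|."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w
    return abs(w)


# Too small, moderate and too large step sizes on the same toy problem.
final = {lr: gradient_descent(lr) for lr in (0.001, 0.1, 1.1)}
```

With the smallest rate, w has barely moved from its starting value after 50 steps; with the moderate rate it is close to the minimum at 0; with the largest rate each step overshoots and |w| grows, so the iteration does not converge.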

Example 3

To verify the effectiveness of the ELMo model, the experiment replaces ELMo in the word vector generation part of the TEXTCNN-NBC model with the word2vec tool and the GloVe model; the resulting models are named word2vec + TN and GloVe + TN, respectively. The accuracy of the three models is shown in Fig. 5.

As can be seen from Fig. 5, the accuracy obtained with word vectors generated by the ELMo model is 10.37% and 5.94% higher than that obtained with vectors generated by the word2vec tool and the GloVe model, respectively. The experiments indicate that the ELMo model is effective: compared with traditional word vectors, ELMo can integrate context-related features and represent polysemous words better.

Example 4

To further examine the effectiveness of the combined TEXTCNN and Bayesian classifier model proposed in this paper, five models were set up for comparative experiments.

Model 1. NBC (Naive Bayes Classifier). Chen et al. (2022) used NBC to classify the sentiment of network users' comments, calculated the influence of Weibo user nodes, and combined users' emotional inclination with user node influence to present an emotional map of public opinion in colleges and universities on social media.

Model 2. TEXTCNN (Text Convolutional Neural Network). Zhang et al. (2022) proposed a TEXTCNN-based Chinese short text classification model that uses back-translation to expand the available data and compensate for the lack of training data.

Model 3. LSTM (Long Short-Term Memory), an LSTM-based neural network model for text sentiment analysis. Liu et al. (2020) proposed a multimodal emotion recognition model based on an LSTM network to address the problem that the recognition accuracy of a single-modality model depends on the type of emotion, so as to better describe the degree of emotion.

Model 4. LSTM-TEXTCNN. Li et al. (2020) proposed an LSTM-TEXTCNN hybrid model to obtain more representative data for the text classification model and improve classification accuracy.

Model 5. TEXTCNN-NBC, the text sentiment analysis model combining TEXTCNN and NBC proposed in this paper.

Comparative experiments were carried out on the same SST data set with the TEXTCNN-NBC, NBC, TEXTCNN, LSTM and LSTM-TEXTCNN models. Figure 6 shows the comparison of experimental accuracy.

As Figure 6 illustrates, the accuracy of the TEXTCNN-NBC model in this study is higher than that of the other models compared. The TEXTCNN-NBC model's accuracy is 0.9813, which is 0.1629 higher than the TEXTCNN model, 0.1726 higher than the LSTM model, 0.1435 higher than the LSTM-TEXTCNN model and 0.1805 higher than the NBC model.
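The baseline accuracies implied by these gaps follow by simple subtraction. The short Python sketch below does the arithmetic, assuming the stated gaps are absolute differences in accuracy; the resulting values are derived here for illustration and are not quoted from Fig. 6 or Table 4.

```python
# Accuracy of the proposed model and the stated gaps to each baseline.
proposed_accuracy = 0.9813
accuracy_gap = {
    "TEXTCNN": 0.1629,
    "LSTM": 0.1726,
    "LSTM-TEXTCNN": 0.1435,
    "NBC": 0.1805,
}

# Implied baseline accuracy = proposed accuracy - gap (assumed absolute difference).
implied_accuracy = {
    name: round(proposed_accuracy - gap, 4) for name, gap in accuracy_gap.items()
}

# Baselines ordered from highest to lowest implied accuracy.
ranking = sorted(implied_accuracy, key=implied_accuracy.get, reverse=True)
```

Under this reading the implied accuracies are 0.8184 for TEXTCNN, 0.8087 for LSTM, 0.8378 for LSTM-TEXTCNN and 0.8008 for NBC.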

As the number of iterations increases, the accuracy rises and the loss value falls. The lower the loss value, the better the model performs. As can be seen from Fig. 7, compared with the other four models, the TEXTCNN-NBC model in this paper converges faster and performs better.

To compare the models comprehensively, Table 4 lists the average Accuracy, Precision, Recall and F1-score of each model. To make the results of the five methods easier to compare, these average values are also displayed visually in Fig. 8.

Table 4 Comprehensive comparison results


As can be seen, the TEXTCNN-NBC model developed in this study has an average accuracy of 0.9813, an average precision of 0.9821, an average recall of 0.9804, and an average F1-score of 0.9812. These values are 16.29, 15.46, 15.53 and 15.49 percentage points higher, respectively, than the Accuracy, Precision, Recall and F1-score of the base model TEXTCNN. The proposed model also performs better than LSTM, LSTM-TEXTCNN and NBC in Accuracy, Precision, Recall and F1-score.
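The reported averages can be checked against Eq. (14), since the F1-score should be the harmonic mean of precision and recall. The Python sketch below recomputes it and also derives the TEXTCNN baseline values implied by the stated gaps, assuming those gaps are absolute differences; the derived baseline numbers are illustrative and are not quoted from Table 4.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as in Eq. (14)."""
    return 2 * precision * recall / (precision + recall)


# Averages reported in the text for TEXTCNN-NBC.
proposed = {"precision": 0.9821, "recall": 0.9804, "f1": 0.9812}
# Reported gaps over the TEXTCNN baseline, read here as absolute differences (assumption).
gap_over_textcnn = {"precision": 0.1546, "recall": 0.1553, "f1": 0.1549}

recomputed_f1 = f1_score(proposed["precision"], proposed["recall"])
implied_textcnn = {k: round(proposed[k] - gap_over_textcnn[k], 4) for k in proposed}
implied_textcnn_f1 = f1_score(implied_textcnn["precision"], implied_textcnn["recall"])
```

Here `recomputed_f1` comes out at about 0.98125, in line with the reported 0.9812, and the implied TEXTCNN precision and recall of 0.8275 and 0.8251 give an F1-score of about 0.8263, which matches the implied F1 gap.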

The comparative experiments above indicate that the proposed model, which combines TEXTCNN with a Bayesian classifier, can enhance the interpretability of the deep learning model in the classification process and can identify the deep emotional features hidden in Weibo public opinion more effectively and quickly. This helps in mining more complex and deep-seated emotional semantic knowledge from the network and in obtaining good classification and retrieval of network emotional information.

5 Conclusion

Results obtained with TEXTCNN alone for text sentiment problems in microblog public opinion analysis are hard to quantify and explain, so it is difficult to directly evaluate each feature and its relative importance. This study proposes a text sentiment analysis model based on TEXTCNN and a Bayesian classifier. Instead of a traditional word embedding method, the model uses ELMo to learn word vectors, with a bidirectional LSTM learning word context, which enriches the word vector representation and handles polysemous words better. The TEXTCNN model helps to identify emotion features in Weibo public opinion more effectively and accurately, and the Bayesian classifier then allows deeper semantic knowledge to be captured quickly. In this way, a relatively comprehensive and good classification and recognition of event emotion features can be obtained. The experimental comparison also shows that the proposed model is significantly more effective than the other four comparison models. Its average accuracy is 0.9813, average precision 0.9821, average recall 0.9804, and average F1-score 0.9812.

Data availability

In this study, publicly accessible datasets were examined. These data are available at: https://nlp.stanford.edu/sentiment/index.html.

Change history

25 March 2024
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s10878-024-01129-7

References

Alizadeh S-H, Hediehloo A, Harzevili N-S (2021) Multi independent latent component extension of naive Bayes classifier. Knowl-Based Syst 213(2):106646
Aljohani N-R, Fayoumi A, Hassan S-U (2023) A novel focal-loss and class-weight-aware convolutional neural network for the classification of in-text citations. J Inf Sci 49(1):79–92
Angeli K-D, Gao S, Danciu I, Durbin EB, Wu XC, Stroup A, Doherty J, Schwartz S, Wiggins C, Damesyn M (2022) Class imbalance in out-of-distribution datasets: Improving the robustness of the TextCNN for the classification of rare cancer types. J Biomed Inform 125:103957
Baity-Jesi M, Sagun L, Geiger M, Spigler S, BenArous G, Cammarota C, LeCun Y, Wyart M, Biroli G (2019) Comparing dynamics:deep neural networks versus glassy systems. J Stat Mech: Theory Exp 12:1240131–12401315
Cambria E (2016) Affective computing and sentiment analysis. IEEE Intell Syst 31(2):102–107
Chen Y, Li Y, Wang Z-F, Quintero A-J, Alma J, Yang C-W, Ji W-Y (2022) Rapid perception of public opinion in emergency events through social media. Nat Hazards Rev 23(2):4021066
Ghasemi R, Asli S, Momtazi S (2022) Deep persian sentiment analysis: cross-lingual training for low-resource languages. J Inf Sci 48(4):449–462
Hong S, Kim J, Woo H-G, Kim Y-C, Lee C-Y (2022) Screening ideas in the early stages of technology development: a word2vec and convolutional neural network approach. Technovation 112:102407
Jeff H (2019) Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep learning. Genet Program Evolvable Mach 19(1/2):305–307
Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: 52nd annual meeting of the association for computational linguistics (ACL 2014), vol 1, June 22–27, Baltimore, Maryland, USA. Association for Computational Linguistics, pp 655–665
Kim Y (2014) Convolutional Neural Networks for Sentence Classification. Eprint Arxiv https://doi.org/10.3115/v1/D14-1181
Li Z-J, Geng C-Y, Song P (2020) Research on short text classification based on joint LSTM TextCNN model. J Xi’an Technol Univ 40(3):299–304
Li X-Y, Rodolfo C-R, Shi X-M (2022) GloVe-CNN-BiLSTM model for sentiment analysis on text reviews. Journal of Sensors. https://doi.org/10.1155/2022/7212366
Liu J-J, Wu X-F (2020) Real-time multimodal emotion recognition and emotion space labeling using LSTM networks. J Fudan Univ: Nat Sci 59(5):565–574
Liu B, Zhang L (2013) A survey of opinion mining and sentiment analysis. Mining Text Data 2013:415–463
Liu J-B, Bao Y, Zheng W-T, Hayat S (2021) Network coherence analysis on a family of nested weighted n-polygon networks. Fract Interdisciplinary J Complex Geometry Nat 29(8):1–15
Liu J-B, Bao Y, Zheng WT (2022) Analyses of some structural properties on a class of hierarchical scale-free networks. Fractals 30(7):1–11
Mikolov T (2014) Using neural networks for modeling and representing natural languages. In: 25th international conference on computational linguistics (COLING 2014): tutorial abstracts, August 23–29, Dublin, Ireland. Association for Computational Linguistics, pp 3–4
Phan H-T, Nguyen N-T, Hwang D (2022) Convolutional attention neural network over graph structures for improving the performance of aspect-level sentiment analysis. Inf Sci 589:416–439
Qu G-J, Wu H-W, Li R-D, Jiao P-F (2021) DMRO: deep meta reinforcement learning-based task offloading for edge-cloud computing. IEEE Trans Netw Serv Manage 18(3):3448–3459
Ronan C, Jason W, Jacob K (2008) A unified architecture for natural language processing: deep neural networks with multitask learning. In: Machine learning, proceedings of the twenty-fifth international conference (ICML 2008), Helsinki, Finland, June 5–9, 2008, pp 160–167
Tan Q, Mu X, Fu M, Yuan H, Sun J, Liang G, Sun L (2022) A new sensor fault diagnosis method for gas leakage monitoring based on the naive Bayes and probabilistic neural network classifier. Measurement 194:111037
Turney P-D (2002) Thumbs up or thumbs down Semantic orientation applied to unsupervised classification of reviews. Association for Comput Linguist 2002:417–424
Valdivia A, Luzon M-V, Herrera F (2017) Sentiment analysis in tripAdvisor. IEEE Intell 32(4):72–77
Wang Z-H, Wang Z-Q, Li S-S, Li P-F (2013) Feature selection for imbalanced sentiment classification. Journal of Chinese Information Processing 4:113–118
Wang H, Yang M, Li Z, Liu Z-H, Hu H, Fu Z-W, Liu F (2022) SCANET: Improving multimodal representation and fusion with sparse- and cross- attention for multimodal sentiment analysis. Computer Animation and Virtual Worlds 33(3/4):1–12
Wu H-M, Zhang Z-R, Guan C, Wolter K, Xu M (2020) Collaborate edge and cloud computing with distributed deep learning for smart city internet of things. IEEE Internet Things J 7(9):8099–8110
Wu H-M, Wolter K, Jiao P-F, Deng Y, Zhao Y, Xu M (2021) EEDTO: an energy-efficient dynamic task offloading algorithm for blockchain-enabled IoT-Edge-Cloud orchestrated computing. IEEE Internet Things J 8(4):2163–2176
Xu XL, Tian H, Zhang X-Y, Qi L-Y, He Q, Dou W-H (2021a) DisCOV: distributed COVID-19 detection on X-Ray images with edge-cloud collaboration. IEEE Trans Serv Comput. https://doi.org/10.1109/TSC.2022.3142265
Xu X-L, Huang Q-H, Zhu H-B, Sharma S, Zhang X-Y, Qi L-Y, Bhuiyan M-ZA (2021b) Secure service offloading for internet of vehicles in SDN-enabled mobile edge computing. IEEE Trans Intell Transp Syst 22(6):3720–3729
Yang X-P, Zhang Z-X, Wang L, Zhang Y-J, Ma Q-F, Wu J-N, Zhang Y (2017) Automatic construction and optimization of sentiment lexicon based on Word2Vec. Comput Sci 44(1):42–47
Yang L, Li Y, Wang J, Sherratt RS (2020) Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access 8:23522–23530
Yang J, Huang Y, Zhang R, Huang F, Meng Q, Feng S (2021) Study on PPG biometric recognition based on multifeature extraction and naive bayes classifier. Sci Program 12:1–12
Ye Q, Zhang Z-Q, Law R (2009) Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Exp Syst Appl 36(3):6527–6535
Zhang T, You F (2021) Research on short text classification based on TextCNN. J Phys: Conf Ser 1757(1):012092
Zhang H, Chen Z, Chen B, Hu B, Li M, Yang C, Jiang B (2022) Complete quadruple extraction using a two-stage neural model for aspect-based sentiment analysis. Neurocomputing 492:452–463


Funding

This work was supported by the Big Data Comprehensive Experiment and Training Center, a university-level Quality Engineering Demonstration Experiment and Training Center (No. 2020 sysxx01), a school-level scientific research project (No. XLZ-202208), the Key Research Project of Natural Science in Universities of Anhui Province (No. KJ2020A0782), and the Special Support Plan for Innovation and Entrepreneurship Leaders in Anhui Province.

Author information

Authors and Affiliations

School of Computer Science and Artificial Intelligence, Chaohu University, Hefei, 238024, Anhui, China
Zhanfeng Wang & Honghai Wang
School of Big Data and Artificial Intelligence, Anhui Xinhua University, Hefei, 230088, Anhui, China
Lisha Yao & Xiaoyu Shao

Authors

Zhanfeng Wang
Lisha Yao
Xiaoyu Shao
Honghai Wang

Contributions

ZF-W designed the method, conducted the experiments, evaluated the findings, and wrote the manuscript. LS-Y, XY-S and HH-W participated in the design and the discussion of the method, discussed the findings, and made extensive revisions to the manuscript.

Corresponding author

Correspondence to Lisha Yao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Z., Yao, L., Shao, X. et al. RETRACTED ARTICLE: A combination of TEXTCNN model and Bayesian classifier for microblog sentiment analysis. J Comb Optim 45, 109 (2023). https://doi.org/10.1007/s10878-023-01038-1


Accepted: 25 April 2023
Published: 11 May 2023
DOI: https://doi.org/10.1007/s10878-023-01038-1
