1 Introduction

In social media, information is shared collaboratively through platforms like Facebook,Footnote 1 Twitter,Footnote 2 or Wikinews.Footnote 3 Such platforms enable the rapid dissemination of information regardless of its trustworthiness, leading to instant consumption of non-curated news. The negative consequence of this openness is the spread of false information disguised as truth, i.e., fake news. Fake news can be defined as deceptive posts intended to mislead consumers, e.g., in their purchases, which places it in the context of misinformation and disinformation (Xiao et al., 2020). Specifically, while misinformation is an inadvertent action, disinformation is the deliberate creation/sharing of false information. Authenticity and intention can be combined as follows: (i) non-factual and misleading, i.e., deceptive news and disinformation; (ii) factual and misleading (cherry-picking); (iii) undefined and misleading (click-bait); and (iv) non-factual and undefined, i.e., misinformation.

Misinformation and fake news are characterized by their large volume, uncertainty, and short-lived nature. Furthermore, they spread faster and further on social media sites, with serious impact on politics and economics (Tandoc, 2019). Accordingly, the European Union (EU) report on the digital transformation of media and the rise of disinformation/fake news (Martens et al., 2018) reinforces the need to strengthen trust in digital media.

This work contributes with a real-time explainable classification method to recognize fake news, promoting trust in digital media as suggested by the SocialTruth project.Footnote 4 In fact, the early discarding of fake news has a positive impact on both information quality and reliability. The proposed method employs stream processing, updating the profiling and classification models on each incoming event. The profiling is built using side-based (related to the creator user and propagation context) and content-based features (extracted from the news text through Natural Language Processing (nlp) techniques), together with unsupervised methods, to create clusters of representative features. The classification relies on stream Machine Learning (ml) algorithms to classify in real-time the nature of each cluster. Finally, the proposed method includes an explanation mechanism to detail why an event has been classified as fake or non-fake. The explanations are presented visually and in natural language on the user dashboard.

The rest of this paper is organized as follows. Section 2 overviews the relevant work on fake news concerning the profiling, classification and detection tasks. Section 3 introduces the proposed method, detailing the data processing and stream-based classification procedures along with the online explainability. Section 4 describes the experimental set-up and the empirical evaluation results considering the online classification and explanation. Finally, Sect. 5 concludes and highlights the achievements and future work.

2 Related work

Social media plays a crucial role in news consumption due to its low cost, easy access, variety, and rapid dissemination (Hu et al., 2014). Indeed, social media is becoming an increasing source of breaking news. However, the fake news problem indicates that social platforms suffer from lack of transparency, reliability, and real-time modeling. In this context, fake news (misinformation/disinformation, such as rumor, deception, hoaxes, spam opinion, click-bait and cherry-picking) are false information created with the dishonest intention to mislead consumers (Choraś et al., 2021; Xiao et al., 2020). To characterize the nature of fake news and understand whether they result from inadvertent or deliberate action, it is necessary to establish their authenticity and the intention of the creator (Shu et al., 2017). In addition, social media streams are subject to feature variation over time (Bondielli and Marcelloni, 2019; Choraś et al., 2021). Thus, the accurate detection of fake news in real time requires proper profiling and classification techniques. However, according to Shu (2022), the current detection techniques are based on opaque models, leaving users clueless about classification outcomes. Consequently, the current work addresses transparency through explanations, reliability through fake news detection, and real-time modeling through incremental content profiling.

The following discussion compares existing works in terms of: (i) stream-based profile modeling for fake detection; (ii) stream-based classification mechanisms; and (iii) transparency and credibility in detection tasks.

2.1 Profiling

Profiling methods model the stakeholders according to their contributions and interactions. Due to information sparsity, it is frequent to represent profiles using side and content information. In addition, in stream-based modeling, profiles are continuously updated and refined. To model fake news stakeholders, the literature contemplates multiple types of profiling methods: (i) creator-based; (ii) content-based; and (iii) context-based.

  • Creator-based profiling focuses on both demographic and behavioral characteristics of the creator. Specifically, the literature contemplates account name, anomaly score,Footnote 5 credibility score, geolocation information, ratio between friends and followers, total number of tweets/posts, etc. (Castillo et al., 2011; Goindani and Neville, 2019; Jang et al., 2021; Jain et al., 2022; Li et al., 2021; Liu and Wu, 2020; Mosallanezhad et al., 2022; Silva et al., 2021a; Vicario et al., 2019; Zubiaga et al., 2017).

  • Content-based profiling explores textual features extracted from the post, aiming to identify the meaning of the content. It can be obtained using linguistic and semantic knowledge, or style analysis via nlp approaches together with fact-checking resources.Footnote 6 Most of the reviewed works exploit this type of features. Content-based profiling encompasses:

      • Lexical and syntactical features are properties related to the syntax, e.g., sentence-level features, such as bag-of-words approaches, n-grams, and part-of-speech. These features are exploited by Dong et al. (2020); Zhou et al. (2020). In addition, Vicario et al. (2019); Jang et al. (2021) compute the overall sentiment score of sentences.

      • Stylistic features provide emphasis and clarity to the text. Tweet-writing styles can be determined through: (i) physical style analysis (e.g., number of adjectives, nouns, hashtags and mentions as well as emotion words and casual words); and (ii) non-physical style analysis (e.g., complexity and readability of the text). The work by Jang et al. (2021) is a representative example of the physical style analysis.

      • Visual features describe the properties of images or videos used to ascertain the credibility of multimedia content. Visual features can: (i) be purely statistic (e.g., number of images/videos); (ii) represent distribution patterns; or (iii) describe user accounts (e.g., background images). While Jang et al. (2021) compute statistic visual features, Liu and Wu (2020) consider information from the user account. Li et al. (2021) verify if the image has been tampered, integrating this information as visual content, and Ying et al. (2021) combine textual with visual content to generate multi-level semantic features.

  • Context-based profiling analyses both the surrounding environment and the creator engagements around the piece of information posted (Castillo et al., 2011; Goindani and Neville, 2019; Jain et al., 2022; Jang et al., 2021; Li et al., 2021; Liu and Wu, 2020; Puraivan et al., 2021; Shu et al., 2019b; Silva et al., 2021a; Song et al., 2021; Zhao et al., 2020). Specifically, it applies user-network analysis and distribution pattern analysis to obtain:

      • Network-based features which aggregate similar online users in terms of location, education background, and habits (Liu and Wu, 2020; Shu et al., 2019b; Silva et al., 2021a).

      • Propagation-based features that describe the dissemination of fake news based on the propagation graph as in the work by Mosallanezhad et al. (2022). These may include, for an online account, the root degree, the number of sub-trees, the maximum/average degree and the tree depth (Castillo et al., 2011; Jang et al., 2021), or the number of retweets/re-posts for the original tweet/post and the fraction of tweets/posts retweeted (Li et al., 2021; Zhao et al., 2020).

      • Temporal-based features which detail how two posts/tweets relate in time. They may comprise the posting frequency, the day of the week of the post (Jang et al., 2021; Silva et al., 2021a), the interval between two posts or even a complete temporal graph (Song et al., 2021).

2.2 Classification

Fake news detection is a classification task. The main news classification techniques in the literature encompass supervised, semi-supervised, unsupervised, deep learning, and reinforcement learning approaches. Deep learning, depending on the problem, can fall into the supervised or unsupervised classification scope (Mathew et al., 2021). Moreover, it demands more computational resources than the corresponding traditional approaches, motivating a separate discussion.

  • Supervised classification is a widely used technique to map objects to classes based on numeric features or inputs (see Table 1). The most frequently used supervised fake news detectors are Bayes, Probabilistic, Neighbor-based, Decision Trees, and Ensemble classifiers.

  • Semi-supervised classification algorithms learn from both labeled and unlabeled samples. They are employed when it is difficult to annotate the samples manually or automatically. The works by Dong et al. (2020) and Shu et al. (2019b) use semi-supervised learning for fake news detection.

  • Unsupervised classification techniques group statistically similar unlabeled data based on underlying hidden features, using clustering algorithms or neural network approaches. The most commonly used cluster algorithms include k-means, Iterative Self-Organizing Data Analysis Technique, and Agglomerative Hierarchical. Li et al. (2021) and Puraivan et al. (2021) are representative examples of this approach.

  • Deep Learning classification relies essentially on neural networks with three or more layers. In terms of fake news, deep learning has been employed mainly for text classification using Convolutional Neural Networks (cnn), Long Short Term Memory (lstm), and Recurrent Neural Networks (rnn) as in the works by Akinyemi et al. (2020) and Nasir et al. (2021).

  • Reinforcement Learning classification works with unlabeled data (Sutton and Barto, 2018), but tends to be slow when applied to real-world classification problems (Dulac-Arnold et al., 2021). While Goindani and Neville (2019), Mosallanezhad et al. (2022), and Wang et al. (2020) perform fake news detection through reinforcement learning, the most used technique is the Multivariate Hawkes Process (mhp) by Goindani and Neville (2019).

Classification can be performed offline or online. Offline or batch processes build static models from pre-existing data sets, whereas online or stream-based processes compute incremental models from live data streams in real-time.

  • Offline classification divides the data set into training—used to create the model—and testing—to assess the quality of the model—partitions. The model remains static throughout the testing stage. This is the most popular fake news detection approach found in the literature.

  • Online classification mines data streams in real-time. Fake news, being dynamic sequences of data originated from multiple sources, i.e., the crowd, demand real-time processing. Typically, whenever new data arrive, the models are incrementally updated, enabling the generation of up-to-date classifications. To the best of the authors’ knowledge, only Ksieniewicz et al. (2020) perform online fake news detection, processing samples as a data stream and considering concept drifts, i.e., that sample classification may naturally change over time.

Classification models can be interpretable or opaque. While opaque models behave as black boxes (e.g., standalone deep neural networks), interpretable models are self-explainable (e.g., tree- or neighbor-based algorithms). Interpretable classifiers explain classification outcomes (Škrlj et al., 2021), clarifying why a given content is false or misleading. More in detail, the explainable fake news detection framework by Shu et al. (2019a) integrates a news content encoder, a user comment encoder, and a sentence-comment co-attention network. The latter captures the correlation between news contents and comments and chooses the top-k sentences and comments to explain the classification outcome. Zhou et al. (2020) explore lexicon-, syntax-, semantic-, and discourse-level features to enhance the interpretability of the models. Mahajan et al. (2021) and Kozik et al. (2022) adopt model agnostic interpretability techniques, such as Local Interpretable Model-agnostic Explanations (lime) (Ribeiro et al., 2016) and the Shapley Additive Explanations (shap) (Lundberg and Lee, 2017), respectively. Finally, Silva et al. (2021a) provide explanations based on feature weights assigned to tweet/retweet nodes in the propagation patterns.

Table 1 provides an overview of the above works considering profiling (creator-, content-, and context-based), classification (supervised, semi-supervised, unsupervised, and reinforcement learning), processing (offline and online) and explainability. Summing up, this literature review shows that existing explainable fake news detectors explore creator-, content-, and context-based profiles, essentially adopt supervised classification and mostly implement offline processing.

Table 1 Comparison of fake news detection approaches considering: (i) profiling (creator, content, context), (ii) classification (supervised, semi-supervised, unsupervised, reinforcement learning), (iii) execution (offline, online), and (iv) explainability (Ex.)

The most closely related works from the literature, considering the pheme experimental data used for design and evaluation, are the fake news classification solutions proposed by Akinyemi et al. (2020), Jain et al. (2022), Ying et al. (2021), and Zubiaga et al. (2017). Firstly, Zubiaga et al. (2017) experimented with sequential (Conditional Random Fields, Maximum Entropy and Enquiry-based) and non-sequential (Naive Bayes, Support Vector Machines (svm) and Random Forests (rf)) classifiers. Secondly, Akinyemi et al. (2020) applied a rf model as the meta classifier trained with a stack-ensemble of svm, rf, and rnn models as base learners. Thirdly, Ying et al. (2021) presented a Multi-level Multi-modal Cross-attention Network for batch fake detection. Furthermore, Jain et al. (2022) employed a Hierarchical Attention Network (han) and a Multi-Layer Perceptron (mlp) trained with creator-, content-, and context-based features. The final prediction (fake or non-fake) combines both classifier outputs through a logical or. Nonetheless, all these solutions work offline without explaining the outcomes. In contrast, our work exploits a wide variety of profiling features (creator, content, and context), operates online and is able to explain the classification outcomes.

Similarly to our research, Puraivan et al. (2021) combined unsupervised and supervised techniques for feature extraction (Principal Component Analysis and t-Distributed Stochastic Neighbor Embedding) and classification (optimized distributed gradient boosting), respectively. However, this offline work disregards the textual content of the news and lacks transparency.

Finally, the sole online system found explores fake news detection with Gaussian Naive Bayes, mlp, and Hoeffding Tree base learners independently and in ensembles (Ksieniewicz et al., 2020). Unfortunately, this work uses another data set collected by the authors and automatically labeled by BS Detector Chrome Extension. Profiles are exclusively based on content features and the outcomes are not explained.

2.3 Research contribution

As previously stated, this work contributes with an explainable classification method to recognize in real-time fake news and, thus, promote trust in digital media. Particularly, the method implements online processing, updating profiles and classification models on each incoming event. First, user profiles are built using creator-, content- and context-based features engineered through nlp. Then, unsupervised methods are exploited to create clusters of representative features. Finally, interpretable stream-based ml classifiers establish the trustworthiness of tweets in real-time. As a result, the proposed method provides the user with a dashboard, combining visual data and natural language knowledge, to make tweet classification transparent.

3 Proposed method

The proposed online and explainable fake news detection system is described in Fig. 1. It is composed of three main modules: (i) the stream-based data processing module (Sect. 3.1) which comprises feature engineering (Sect. 3.1.1), and analysis and selection tasks (Sect. 3.1.2); (ii) the stream-based classification module (Sect. 3.2) composed of lexicon-based (Sect. 3.2.1), unsupervised and supervised (Sect. 3.2.2) classifiers; and (iii) the stream-based explainability module (Sect. 3.3). The explored data comprises two collections of tweets related to breaking news events released in 2016 (pheme) and augmented in 2018 (pheme-r).

Fig. 1 System diagram composed of: (i) stream-based data processing, (ii) online classification, and (iii) stream-based explainability

3.1 Stream-based data processing

This module exploits nlp techniques to take full advantage of the ml models. Firstly, the feature engineering process generates new knowledge from the experimental data. Then, it analyses the resulting feature set to finally select the most relevant features for the classification.

3.1.1 Feature engineering

The proposed system computes features from a wide spectrum: (i) creator-, (ii) content- (lexical and syntactical features, stylistic features, and visual features), and (iii) context-based (network, distribution and temporal) features.

The creator-based features specify whether the user has an account description, a profile image and if the account has been protected and/or verified, the timezone, the number of followers and friends, the ratio between friends and followers, as well as the number of favourite tags received by the user. In the end, the time span in days between user registration and tweet post is calculated along with the weekly post frequency of the user.Footnote 7

The linguistic and syntactic content-based features include the word n-grams from the processed tweet and whether the content is duplicated in the experimental data set. The physical style features comprise the adjective, auxiliary, bad word, determiner, difficult word, hashtag, link (also repeated), noun, pronoun, punctuation, uppercase word and word counters. The sentiment-related features comprise emotion (anger, fear, happiness, sadness and surprise) and polarity (negative, neutral and positive). The non-physical style-based features are based on the Flesch reading ease metric (see Table 2), the McAlpine eflaw readability score for English foreign speakersFootnote 8 and the reading time in seconds. Concerning visual-based features, the system verifies if the tweet contains links to images and videos.

The generated context-based features consider whether the tweet has been retweeted and/or favourited, the depth of the retweet distribution network and the number of first-level retweets. Finally, the distribution pattern is analysed through the retweet and favourite counters.

Table 2 Flesch reading ease score and difficulty
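As an illustration of the readability feature, the following minimal sketch computes the Flesch reading ease score from raw counts using the standard formula. The function name and its count-based signature are illustrative assumptions; the actual system relies on third-party libraries for this feature (see Sect. 4.2.1), and word, sentence and syllable counting is left to the caller here.

```python
def flesch_reading_ease(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Standard Flesch reading ease formula; higher scores mean easier text.

    The counts are supplied by the caller (hypothetical interface): this sketch
    does not tokenise text or count syllables itself.
    """
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Example: a single 10-word sentence with 14 syllables.
score = flesch_reading_ease(n_words=10, n_sentences=1, n_syllables=14)
```

Shorter sentences and shorter words both push the score upwards, which is why the metric is a convenient proxy for the non-physical style of a tweet.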

The specific techniques applied to compute the aforementioned features will be described in Sect. 4.2.1 along with the data processing details.

3.1.2 Feature analysis and selection

Prior to feature selection, the system computes the variance of the features to establish their relative importance and then discards those with low variance. Thus, the dimension of the feature space is reduced to minimize the computational load and time needed by the ml models to classify tweets.
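The variance-based selection can be sketched in a streaming setting as follows. This is a minimal standard-library illustration, not the library component used in the experiments: the class names, the `learn_one`/`transform_one` method names and the threshold value are assumptions made for the example. Per-feature variance is maintained incrementally with Welford's algorithm, so no past samples need to be stored.

```python
class RunningVariance:
    """Incremental (Welford) estimate of the population variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / self.n if self.n else 0.0


class OnlineVarianceSelector:
    """Keeps only the features whose running variance exceeds a threshold."""

    def __init__(self, threshold: float = 0.0):
        self.threshold = threshold  # hypothetical value, to be tuned
        self.stats = {}

    def learn_one(self, x: dict) -> None:
        for name, value in x.items():
            self.stats.setdefault(name, RunningVariance()).update(float(value))

    def transform_one(self, x: dict) -> dict:
        return {
            name: value
            for name, value in x.items()
            if name in self.stats and self.stats[name].variance > self.threshold
        }


selector = OnlineVarianceSelector(threshold=0.0)
for sample in [{"a": 1, "b": 5}, {"a": 1, "b": 7}, {"a": 1, "b": 9}]:
    selector.learn_one(sample)
selected = selector.transform_one({"a": 1, "b": 3})  # the constant feature "a" is dropped
```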

3.2 Stream-based classification

The proposed method involves lexicon-based (Sect. 3.2.1), unsupervised and supervised (Sect. 3.2.2) classification in both the predict and train steps of each incoming tweet.

3.2.1 Frequency-based lexicon

The adopted frequency-based lexicon is applied to the content of each incoming tweet. Algorithm 1 provides the corresponding pseudo-code. The lexica allow swift prediction followed by updating (training) based on the tweet content. The training stage considers the target class the n-grams represent and their frequency. More in detail, it defines three thresholds: (i) the n-gram range to extract the words; (ii) the number of elements to be included in the resulting lexica; and (iii) the frequency used as insert condition.

Algorithm 1 Frequency-based lexicon generation
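One plausible reading of the frequency-based lexicon is sketched below. It is an assumption-laden illustration rather than a transcription of Algorithm 1: the class and method names are hypothetical, the disjointness rule (an n-gram is kept for a class only if it never occurred in the other class) is an assumption, and the `min_freq` default is arbitrary. The three constructor arguments mirror the three thresholds described above.

```python
from collections import Counter


def extract_ngrams(tokens, n_min, n_max):
    """Yield all word n-grams with n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


class FrequencyLexicon:
    LABELS = ("fake", "non-fake")

    def __init__(self, ngram_range=(2, 4), max_size=700, min_freq=2):
        self.ngram_range = ngram_range  # threshold (i): n-gram range
        self.max_size = max_size        # threshold (ii): lexicon size
        self.min_freq = min_freq        # threshold (iii): insert condition
        self.counts = {label: Counter() for label in self.LABELS}

    def learn_one(self, text, label):
        tokens = text.lower().split()
        self.counts[label].update(extract_ngrams(tokens, *self.ngram_range))

    def lexicon(self, label):
        """Most frequent n-grams of a class that never occurred in the other class."""
        other = self.LABELS[1 - self.LABELS.index(label)]
        kept = [
            gram
            for gram, freq in self.counts[label].most_common()
            if freq >= self.min_freq and gram not in self.counts[other]
        ]
        return kept[: self.max_size]

    def predict_one(self, text):
        """Return the class whose lexicon matches more n-grams, or None on a tie."""
        grams = set(extract_ngrams(text.lower().split(), *self.ngram_range))
        scores = {label: len(grams & set(self.lexicon(label))) for label in self.LABELS}
        if scores["fake"] == scores["non-fake"]:
            return None
        return max(scores, key=scores.get)


lex = FrequencyLexicon(ngram_range=(2, 2), max_size=10, min_freq=2)
lex.learn_one("breaking news pilot locked out", "fake")
lex.learn_one("pilot locked out of cockpit", "fake")
lex.learn_one("police confirm the report", "non-fake")
lex.learn_one("officials say police confirm the report", "non-fake")
label = lex.predict_one("the pilot locked himself")
```

Because both prediction and training only touch counters, the lexicon can be consulted and updated on every incoming tweet at low cost.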

3.2.2 Unsupervised and supervised classification

First, the unsupervised classification creates clusters of comparable spatial extent by splitting the input data based on their proximity. It applies k-means clustering (Sinaga and Yang, 2020; Vouros et al., 2021) to minimise the within-cluster variance, i.e., the sum of squared Euclidean distances between the samples and their cluster centroid. Then, for each discovered cluster, one supervised classifier is trained.
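The cluster-then-classify scheme can be illustrated with the sketch below. It assumes a sequential (MacQueen-style) variant of k-means in which the first k samples seed the centroids, and it uses a trivial majority-class learner as a placeholder for the stream classifiers listed next; all class names are hypothetical and the actual k-means implementation used in the experiments may differ.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class SequentialKMeans:
    def __init__(self, k):
        self.k = k
        self.centroids = []
        self.counts = []

    def predict_one(self, x):
        return min(range(len(self.centroids)), key=lambda j: sq_dist(x, self.centroids[j]))

    def learn_one(self, x):
        # Assumption: the first k samples seed the centroids.
        if len(self.centroids) < self.k:
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1
        j = self.predict_one(x)
        self.counts[j] += 1
        rate = 1.0 / self.counts[j]  # running-mean update of the winning centroid
        self.centroids[j] = [c + rate * (xi - c) for c, xi in zip(self.centroids[j], x)]
        return j


class MajorityClass:
    """Placeholder learner standing in for a stream classifier."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class PerClusterClassifier:
    """Routes each sample to the classifier owned by its nearest cluster."""

    def __init__(self, k, make_model):
        self.kmeans = SequentialKMeans(k)
        self.make_model = make_model
        self.models = {}

    def predict_one(self, x):
        if not self.kmeans.centroids:
            return None
        model = self.models.get(self.kmeans.predict_one(x))
        return model.predict_one(x) if model else None

    def learn_one(self, x, y):
        j = self.kmeans.learn_one(x)
        if j not in self.models:
            self.models[j] = self.make_model()
        self.models[j].learn_one(x, y)


clf = PerClusterClassifier(k=2, make_model=MajorityClass)
for x, y in [([0, 0], "fake"), ([10, 10], "non-fake"), ([1, 1], "fake"), ([9, 9], "non-fake")]:
    clf.learn_one(x, y)
```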

The method involves several well-known stream-based ml models, selected according to their good performance in similar classification problems (Aphiwongsophon and Chongstitvatana, 2018; Silva et al., 2021b; Xiao et al., 2020).

  • Adaptive Random Forest Classifier (arfc) (Gomes et al., 2017). It induces diversity using re-sampling, random feature subsets for node splits and drift detectors per base tree.

  • Hoeffding Adaptive Tree Classifier (hatc) (Bifet and Gavaldà, 2009). It uses a drift detector to monitor branch performance. Moreover, it presents a more efficient and effective bootstrap sampling strategy compared to the original Hoeffding Tree classifier.

  • Hoeffding Tree Classifier (htc) (Pham et al., 2017). It is an incremental decision tree algorithm which quantifies the number of samples needed to estimate the statistics while guaranteeing the prescribed performance.

  • Gaussian Naive Bayes (gnb) (Xue et al., 2021). It enhances the original Naive Bayes algorithm by exploiting a Gaussian distribution per feature and class.
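To make the incremental nature of these models concrete, the following sketch shows how a Gaussian Naive Bayes classifier can be updated one sample at a time by keeping a running mean and variance per feature and class. It is a simplified standard-library illustration under stated assumptions (dictionary-based samples, a small variance smoothing term, hypothetical class and method names), not the implementation evaluated in this work.

```python
import math
from collections import defaultdict


class OnlineGaussianNB:
    def __init__(self, var_smoothing=1e-6):
        self.var_smoothing = var_smoothing   # assumption: avoids zero variances
        self.class_counts = defaultdict(int)
        self.stats = defaultdict(dict)       # label -> feature -> [n, mean, m2]

    def learn_one(self, x, y):
        self.class_counts[y] += 1
        for feature, value in x.items():
            s = self.stats[y].setdefault(feature, [0, 0.0, 0.0])
            s[0] += 1
            delta = value - s[1]
            s[1] += delta / s[0]
            s[2] += delta * (value - s[1])   # Welford update

    def _log_joint(self, x, y):
        total = sum(self.class_counts.values())
        log_p = math.log(self.class_counts[y] / total)  # class prior
        for feature, value in x.items():
            if feature not in self.stats[y]:
                continue
            n, mean, m2 = self.stats[y][feature]
            var = m2 / n + self.var_smoothing
            log_p += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return log_p

    def predict_one(self, x):
        if not self.class_counts:
            return None
        return max(self.class_counts, key=lambda y: self._log_joint(x, y))


gnb = OnlineGaussianNB()
for value, label in [(0.8, "fake"), (0.9, "fake"), (0.1, "non-fake"), (0.2, "non-fake")]:
    gnb.learn_one({"surprise": value}, label)
```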

Algorithmic performance is determined with the help of classification accuracy, F-measure (macro and micro-averaging) and run-time metrics, following the prequential evaluation protocol (Gama et al., 2013).
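The prequential (test-then-train) protocol can be sketched as below: each sample is first used to test the current model and only afterwards to train it. The function name, the returned dictionary and the placeholder majority-class learner are illustrative assumptions; the sketch reports accuracy, the per-class F-measure and its macro average, and skips samples for which the model cannot yet predict.

```python
from collections import Counter


class MajorityClass:
    """Placeholder learner; any model exposing predict_one/learn_one would do."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


def prequential(stream, model):
    correct = total = 0
    tp, fp, fn = Counter(), Counter(), Counter()
    for x, y in stream:
        y_pred = model.predict_one(x)      # 1) test on the incoming sample
        if y_pred is not None:
            total += 1
            if y_pred == y:
                correct += 1
                tp[y] += 1
            else:
                fp[y_pred] += 1
                fn[y] += 1
        model.learn_one(x, y)              # 2) then train with it
    f_measure = {}
    for label in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[label] + fp[label] + fn[label]
        f_measure[label] = 2 * tp[label] / denom if denom else 0.0
    return {
        "accuracy": correct / total if total else 0.0,
        "f_measure": f_measure,
        "macro_f_measure": sum(f_measure.values()) / len(f_measure) if f_measure else 0.0,
    }


stream = [({}, "fake"), ({}, "fake"), ({}, "non-fake"), ({}, "fake")]
report = prequential(stream, MajorityClass())
```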

3.3 Stream-based explainability module

Transparency is essential to make results both understandable and trustworthy for the end users. This means that outcomes need to be accompanied by explanatory descriptions. The designed fake news classification solution relies on interpretable models to obtain and present the relevant data in an explainability dashboard. The explanation of each prediction includes:

  • Relevant user, content and context features selected by the supervised ml models.

  • Predicted class (fake or non-fake) together with confidence.

  • K disjoint elements ordered by their appearance frequency extracted from the fake and non-fake lexica.

  • K features that surround the centroid of the cluster to which the entry belongs.

The latter is completed with natural language descriptions of the corresponding tree decision path.

4 Experimental results

All experiments were performed using a server with the following hardware specifications:

  • Operating System: Ubuntu 18.04.2 LTS 64 bits

  • Processor: Intel® Core i9-10900K 2.80 GHz

  • RAM: 96 GB DDR4

  • Disk: 480 GB NVME + 500 GB SSD

4.1 Experimental data sets

The experiments were performed with temporally ordered data streams created from the pheme and pheme-r data setsFootnote 9 and, for additional testing, from the Nikiforos et al. (2020) data set.Footnote 10 The pheme collections comprise 6424 tweets created by 2893 users between August 2014 and March 2015. All tweets were manually labeled as fake and non-fake. The data set from Nikiforos et al. (2020) contains 2366 tweets posted by 51 users between April 2013 and December 2019. This data set was exclusively used to confirm the performance of the proposed method (see Sect. 4.3.3). Table 3 details the number of users and tweets per class in each experimental data set.

Table 3 Classes, number of users and tweets of the experimental data sets

4.2 Stream-based data processing

As previously mentioned, data processing applies nlp techniques to ensure the competitive performance of the ml models. The procedures used for online feature engineering, analysis and selection are presented below.

4.2.1 Feature engineering

Firstly, tweet content is purged of URLs, redundant blank spaces, special characters (non-alphanumerical items, like accents and punctuation marks) and stop-words from the list provided by the Natural Language Toolkit (nltk).Footnote 11 The remaining content is lemmatised with the English en_core_web_md modelFootnote 12 of the spaCy libraryFootnote 13 and content polarity is established with TextBlob,Footnote 14 a sentiment analysis component for spaCy. The tweet emotion is calculated using the Text2emotion Python library.Footnote 15
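The cleaning step can be approximated with the standard library alone, as in the sketch below. The regular expressions, the function name and, in particular, the tiny stop-word set are illustrative assumptions standing in for the nltk list; lemmatisation, polarity and emotion extraction are not covered.

```python
import re
import unicodedata

# Illustrative subset only; the described pipeline uses the nltk stop-word list.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")


def clean_tweet(text: str) -> str:
    text = URL_RE.sub(" ", text)                      # remove URLs
    text = unicodedata.normalize("NFKD", text)        # split accents from letters
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = NON_ALNUM_RE.sub(" ", text)                # drop punctuation and symbols
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]
    return " ".join(tokens)                           # also collapses blank spaces


cleaned = clean_tweet("BREAKING: the pilot is locked out!  https://t.co/abc café")
```

Letter case is preserved in this sketch so that case-dependent counters, such as the number of uppercase words, can still be computed on the cleaned text.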

The creation of non-physical style features relies on the TextDescriptivesFootnote 16 spaCy module (features 13, 14, 17, 26, 28 and 29 in Table 4) and on the TextstatFootnote 17 Python library (features 18, 20, 25 and 30 in Table 4). The bad word count (feature 15 in Table 4) depends on the list provided by Wikimedia Meta-wiki.Footnote 18

Given the importance of hashtags within tweets, hashtags are decomposed into their elementary constituents, i.e., words. This is applied to the cases where the hashtag is not represented in title format.Footnote 19 This splitter uses a freely available English corpus, the Alpha lexicon,Footnote 20 along with the English corpus by García-Méndez et al. (2019). It employs a recursive and reentrant algorithm to minimise the number of splits needed to decompose the hashtag into correct English words. As an example, the proposed text decomposition solution splits hatecannotdriveouthate as hate cannot drive out hate.
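A recursive decomposition that minimises the number of splits can be sketched as follows. The function name and the toy vocabulary are assumptions made for the example; the actual splitter relies on the corpora cited above. Memoisation keeps the recursion from re-solving the same suffix.

```python
from functools import lru_cache


def split_hashtag(tag, vocabulary):
    """Split a hashtag into the fewest vocabulary words, or return None."""
    tag = tag.lstrip("#").lower()
    vocab = frozenset(vocabulary)
    max_len = max(map(len, vocab), default=0)

    @lru_cache(maxsize=None)
    def best(start):
        if start == len(tag):
            return ()
        result = None
        for end in range(start + 1, min(len(tag), start + max_len) + 1):
            word = tag[start:end]
            if word in vocab:
                rest = best(end)
                if rest is not None and (result is None or 1 + len(rest) < len(result)):
                    result = (word,) + rest
        return result

    words = best(0)
    return list(words) if words is not None else None


toy_vocabulary = {"hate", "can", "not", "cannot", "drive", "out"}  # illustrative only
words = split_hashtag("#hatecannotdriveouthate", toy_vocabulary)
```

With this toy vocabulary, the split using cannot is preferred over the one using can and not because it requires fewer words.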

The word n-grams are extracted from the accumulated tweet textual data using CountVectorizerFootnote 21 Python library. Listing 1 shows the ranges and best values for the CountVectorizer configuration parameters based on iterative experimental tests with GridSearchFootnote 22 meta transformer wrapper for the hatc classifier.

Listing 1 Parameter ranges for the generation of n-grams (best values in bold)

Table 4 shows the creator-, content- and context-based features selected for the detection of fake news. An additional pair of features is created for each user and numerical feature in Table 4 (features 6–9, 13–18, 21–24, 26, 28, 29, 31–33, 39 and 40): the user incremental feature average and latest feature trend, a Boolean feature that compares the last user feature value with the current user feature average.Footnote 23
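The pair of per-user features can be maintained incrementally, as in the following sketch. Two details are assumptions, since the exact definition is not given in this section: the running average includes the current value, and the trend is taken to be true when the latest value exceeds that average. The class and method names are hypothetical.

```python
from collections import defaultdict


class UserFeatureTracker:
    """Running per-user average of a numerical feature plus a Boolean trend."""

    def __init__(self):
        self._count = defaultdict(int)
        self._mean = defaultdict(float)

    def update(self, user_id, feature, value):
        key = (user_id, feature)
        self._count[key] += 1
        self._mean[key] += (value - self._mean[key]) / self._count[key]
        average = self._mean[key]
        trend = value > average  # assumption: True when above the user average
        return average, trend


tracker = UserFeatureTracker()
first = tracker.update("user_1", "word_count", 10)
second = tracker.update("user_1", "word_count", 20)
third = tracker.update("user_1", "word_count", 9)
```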

Table 4 Features considered for the classification by profile (creator, content, context) and data type (Boolean, categorical, numerical, textual)

4.2.2 Feature analysis and selection

The method analyses the variance of features in Table 4 to compute their relative importance. Those features with low variance are discarded. Particularly, feature selection is performed at each incoming event using the VarianceThresholdFootnote 24 algorithm from RiverFootnote 25 library to improve the fake class recall metric.

4.3 Stream-based classification

Online classification involves prediction and training for each incoming sample. This section presents the results obtained by the lexicon-based, unsupervised and supervised classification procedures.

4.3.1 Frequency-based lexicon

The building of the dynamic frequency-based lexicon starts after accumulating 5% of the samples. More in detail, the system extracts 700 unique elements of 2 to 4 words in length for each target class (fake and non-fake). Listing 2 provides the configuration parameter ranges. Best values were once again obtained from iterative experimental tests using the hatc classifier.

Listing 2 Parameter ranges for the generation of the frequency-based lexicon (best values in bold)

4.3.2 Unsupervised and supervised classification results

As described in Sect. 3.2, the first step applies unsupervised clustering using the widely known k-means model.Footnote 26 Then, for each of the discovered clusters, one supervised classifier is trained using implementations of the arfc, hatc, htc and gnb models.

Hyperparameter optimisation is performed for the aforementioned ml algorithms. Listings 3, 4, 5 and 6 show the configuration ranges and best values (in bold) for each algorithm.

Listing 3 Hyperparameter ranges for the arfc model (best values in bold)

Listing 4 Hyperparameter ranges for the hatc model (best values in bold)

Listing 5 Hyperparameter ranges for the htc model (best values in bold)

Listing 6 Hyperparameter ranges for the gnb model (best value in bold)

Table 5 shows the performance of the ml models. Set a of features includes those in Table 4 except for word n-grams, whereas, set b includes set a plus the latter textual features. Finally, set c is composed of set b plus the frequency-based lexicon. The proposed solution exhibits a processing time of 0.42 s/sample in the worst scenario (arfc model and the set of features a), which can be considered real time.

In light of the results, arfc exhibits the best performance with all feature sets and for all evaluation metrics. The use of word n-grams results in significant improvement across all algorithms. The highest boost occurs for the gnb model (+12 percentage points in accuracy and micro F-measure for the fake class). Despite the promising results, micro F-measure values for the target fake class remain under the 70% threshold with feature sets a and b. Finally, the solution reaches accuracy and macro F-measure values of about 80% with all engineered features (set c).

Table 5 Online fake detection results in terms of accuracy, macro and micro F-measure (best values in bold) and run-time for the arfc, hatc, htc and gnb models by feature set

4.3.3 Discussion

Since the majority of the competing works implement batch rather than stream processing and use different data sets, result comparison may not be straightforward. Batch and stream results are only directly comparable if obtained with the same data samples. This means that, ideally, the comparison should be made with a chronologically ordered data set, and the evaluation should consider only the test partition samples. In the case of stream processing, this is achieved by setting the dimension of the sliding window to the number of samples of the test partition and then processing the data set as a stream.
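The effect of the sliding window on the reported metrics can be illustrated with the sketch below, where accuracy is computed only over the most recent predictions. The class name and interface are assumptions; setting the window size to the number of samples in the test partition restricts the evaluation to that partition once the whole stream has been processed.

```python
from collections import deque


class WindowedAccuracy:
    """Accuracy over the last `size` predictions of the stream."""

    def __init__(self, size: int):
        self.hits = deque(maxlen=size)  # older outcomes are discarded automatically

    def update(self, y_true, y_pred) -> None:
        self.hits.append(y_true == y_pred)

    def get(self) -> float:
        return sum(self.hits) / len(self.hits) if self.hits else 0.0


metric = WindowedAccuracy(size=2)
for y_true, y_pred in [("fake", "fake"), ("fake", "non-fake"), ("non-fake", "non-fake")]:
    metric.update(y_true, y_pred)
window_accuracy = metric.get()  # only the last two predictions are considered
```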

The batch classification works by Zubiaga et al. (2017), Akinyemi et al. (2020) and Ying et al. (2021) explore the same pheme data set with cross-folded validation, using 80% of the samples for training and 20% for testing. The related online fake news classification system of Ksieniewicz et al. (2020) employs another data set, preventing direct comparison.

Table 6 provides the theoretical comparison results of the most related works together with those of the proposed solution with a sliding window holding 20% of the data (for offline comparison) and a sliding window comprising all data (for online comparison).Footnote 31 The proposed solution with a sliding window of 20% of the data achieves an improvement in macro F-measure of 20.12 and 17.42 percentage points with respect to the works of Zubiaga et al. (2017) and Jain et al. (2022), respectively. Moreover, it attains +4.62 percentage points in fake F-measure compared to Akinyemi et al. (2020). When compared with the batch and online deep learning approaches of Ying et al. (2021) and Ksieniewicz et al. (2020), the proposed solution exhibits slightly lower performance but grants algorithmic transparency with lower memory and computation time requirements. Finally, for a fair comparison with the most related work by Ksieniewicz et al. (2020), and since the authors provided the implementation of their solution, we ran their experiments with the pheme data set, obtaining an accuracy of 74.10% (6.16 percentage points lower than our proposal).

Table 6 Fake detection theoretical comparison in terms of accuracy, macro and micro F-measure between related works and the proposed solution

Originally, Nikiforos et al. (2020) achieved an accuracy of 99.79% and 99.37% with Naive Bayes and rf offline classifiers, respectively. Both models were trained with a synthetic minority over-sampled set generated from 80% of the original data (to overcome the class imbalance of the original data) and tested with the remaining 20% of the original data. To compare with these results, the experiment was repeated with a sliding window comprising 20% of the total number of samples and the best arfc model. In this case, the current solution attained 99.14% accuracy, a macro F-measure of 97.54%, and micro F-measures of 99.52% and 95.56% for the non-fake and fake classes, respectively. This means that the proposed online method achieves a comparable accuracy without oversampling and in real time.
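As a consistency check, the macro F-measure is the unweighted mean of the per-class F-measures, so it can be recomputed from the two class values reported above. The following minimal sketch does so; the function name macro_f_measure is a hypothetical helper, not part of any library.

```python
def macro_f_measure(per_class_f):
    """Unweighted mean of the per-class F-measures."""
    return sum(per_class_f.values()) / len(per_class_f)


# Per-class F-measures (in %) reported for the 20% sliding window experiment.
per_class = {"non-fake": 99.52, "fake": 95.56}
macro = macro_f_measure(per_class)
print(f"{macro:.2f}")  # prints 97.54
```

Because both classes contribute equally, the minority fake class weighs as much as the majority non-fake class, which is why the macro value sits below the accuracy.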

4.4 Stream-based explainability module

Figure 2 shows the user explainability dashboard, which aims to make the model outcome comprehensible. The upper part displays the classification of the tweet sample. The user name is Zone 6 Combatives and the timezone is Canadian. The top center displays the tweet content, and the center presents the creator-, content- and context-related features selected by the ml classifier. A feature warning is shown when a feature deviates from the user average, as is the case for the reading ease and time features. Otherwise, the feature is marked with an ok symbol, as in the case of the 5-years post-registration span feature. The classifier singled out the word pilot as relevant. The tweet was classified as fake with 81% confidence, according to the Predict_Proba_OneFootnote 32 method from the River ml library. Finally, the most representative features for both the frequency-based lexicon and the clustering procedure are provided.Footnote 33
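The two dashboard elements described above, i.e., the feature warnings and the prediction confidence, can be sketched as follows. This is a sketch under stated assumptions: the text does not specify how a deviation from the user average is detected, so the k-sigma rule, the RunningStats class, and the functions feature_status and top_prediction are hypothetical. The probability dictionary mimics the label-to-probability mapping returned by the River method mentioned above, with illustrative values.

```python
import math


class RunningStats:
    """Incremental mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    @property
    def std(self):
        # Population standard deviation of the values seen so far.
        return math.sqrt(self._m2 / self.n) if self.n > 0 else 0.0


def feature_status(value, stats, k=2.0):
    """Return 'warning' if value lies more than k std from the user average.

    ASSUMPTION: the k-sigma rule is illustrative, not the paper's rule.
    """
    if stats.n == 0:
        return "ok"
    return "warning" if abs(value - stats.mean) > k * stats.std else "ok"


def top_prediction(probabilities):
    """Pick the most probable label and its probability (the confidence)."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical history of one user's reading ease scores.
reading_ease = RunningStats()
for score in [60, 62, 58, 61, 59]:
    reading_ease.update(score)

status_outlier = feature_status(30, reading_ease)  # far from the average
status_typical = feature_status(61, reading_ease)  # close to the average

# Illustrative probabilities in the label -> probability format.
label, confidence = top_prediction({"fake": 0.81, "non-fake": 0.19})
```

Keeping only the count, mean and squared-deviation sum per feature means the user average can be updated on each incoming event without storing past samples, which fits the stream setting.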

The bottom part of the dashboard displays the decision tree path (obtained using the debug one and drawFootnote 34 functions) and the corresponding natural language description. In particular, the first decision is based on the surprise feature (see feature 19 in Table 4). If its value is lower than or equal to 0.55, the reasoning continues through the left branch. Otherwise, it continues through the right branch.
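The transcription of a decision path into natural language can be illustrated with a minimal sketch. It does not reproduce the functions mentioned above: the Split and Leaf classes and the explain_path function are hypothetical, and the leaf labels of the one-level tree are placeholders, since only the root test on the surprise feature is described in the text.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: str
    threshold: float
    left: Union["Split", Leaf]   # followed when value <= threshold
    right: Union["Split", Leaf]  # followed when value > threshold


def explain_path(node, sample):
    """Walk the tree for one sample and describe each decision in words."""
    steps = []
    while isinstance(node, Split):
        value = sample[node.feature]
        if value <= node.threshold:
            steps.append(
                f"{node.feature} ({value}) is lower than or equal to "
                f"{node.threshold}, so the left branch is followed"
            )
            node = node.left
        else:
            steps.append(
                f"{node.feature} ({value}) is greater than "
                f"{node.threshold}, so the right branch is followed"
            )
            node = node.right
    steps.append(f"the sample is classified as {node.label}")
    return steps


# Root test from the text; the leaf labels are hypothetical placeholders.
tree = Split("surprise", 0.55, left=Leaf("non-fake"), right=Leaf("fake"))
explanation = explain_path(tree, {"surprise": 0.7})
```

Each visited node yields one sentence, so the length of the explanation grows with the depth of the path rather than with the size of the tree.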

Fig. 2 Explainability dashboard comprising: (i) selected features from the content, context, and creator, (ii) the prediction, (iii) representative entries of the frequency-based lexicon and the clustering procedure, and (iv) the decision path and its natural language transcription

5 Conclusion

Social media is becoming an increasingly important source of breaking news. On these platforms, information is shared regardless of the context and reliability of the content and of its creator. This model of instant news dissemination and consumption easily propagates fake news, constituting a challenge in terms of transparency, reliability, and real-time processing. Accordingly, the proposed solution addresses transparency through explanations, reliability through fake news detection, and real-time processing through incremental profiling and learning. The motivation for the current work lies in the early detection, isolation and explanation of misinformation, all of which are crucial to increase the quality of and trust in digital media platforms.

In more detail, this work contributes an explainable classification method to recognize fake news in real time. The proposed method combines unsupervised and supervised approaches with lexica created online. Specifically, it comprises (i) stream-based data processing (through feature engineering, analysis and selection), (ii) stream-based classification (lexicon-based, unsupervised and supervised classification), and (iii) stream-based explainability (prediction confidence and interpretable classification). Furthermore, the profiles are built using creator-, content- and context-based features with the help of nlp techniques. The experimental classification results of 80% accuracy and macro F-measure, obtained with a real, manually annotated data set, endorse the promising performance of the designed explainable real-time fake news detection method.

According to the analysis of the related work, this proposal is the first to jointly provide stream-based data processing, profiling, classification and explainability. Future work will attempt to further mitigate the impact of fake news within social media by automatically identifying and isolating potentially malicious accounts, and will extend the research to related tasks like stance detection by exploiting new creator-, content- and context-based features.