1 Introduction and research motivation

Nowadays, social media platforms are displacing traditional ways of communication and information exchange. This trend continues due to the critical advantages that online social media provide: no cost, easy access, and constant availability. Moreover, they are in standby mode by default, bombarding users with real-time notifications about updates in their online neighborhood. Users feel up to date with all the information without the need to search for it, as every update is delivered to them immediately. Confronted with news, pictures, videos, etc., all at once, people cannot correctly assess the content and verify whether it is true or false. They make this decision based on their subjective feeling about the processed information (Pennycook & Rand 2021). The mentioned properties of social media platforms create an excellent opportunity to share information containing malicious content. It has been observed that such incidents are increasing and can affect different aspects of life, e.g., impacting election results or financial and mental situations (Higdon 2020; Market chaos after fake Obama explosion tweet—ABC News (Australian Broadcasting Corporation), b.d.). Identifying the source of information is crucial, as it can reduce disinformation and consequently avoid more severe problems (Frąszczak 2021a, 2022; Khan et al. 2021).

In most cases, fake news or rumors on social media platforms are initiated by groups of users located in separate neighborhoods. Many of those users are not real people but computer bots that propagate the desired messages and can be easily attached to or detached from the network. They are placed in different parts of the network because the organization behind them aims to cause widespread havoc among users and wants to cover the network with rumor content as soon as possible. It is reported that people believe information and share it faster when it can be confirmed from multiple sources (Higdon 2020; Khan et al. 2021). Furthermore, it is harder to classify given content as fake when it is spread in numerous groups simultaneously rather than in one. Multiple methods for identifying rumor sources in networks have been developed recently (Frąszczak 2021a; Jiang et al. 2017; Shelke & Attar 2019). Most of them address the single-source detection problem, and only some can be used for multi-source tasks. Unfortunately, most of the latter require the exact number of sources as input, which is rarely known in real-life cases (Frąszczak 2021a; Shelke & Attar 2019). This paper investigates the problem by analyzing the available community detection methods and evaluating their accuracy. Furthermore, it also validates the accuracy of well-known source detection methods over the identified outbreaks.

This paper is divided into six main parts. The first introduces online social networks and their mathematical background, together with essential information regarding rumor propagation and source detection. The second analyzes the techniques used for multi-source detection problems and their approach to finding rumor outbreaks. The third introduces the simulation environment used to carry out the analysis of current methods. The fourth contains the simulation conditions, and the fifth describes the performed experiments and presents the obtained results. The last one concludes the paper, summarizes the problems, and indicates possible future development directions in this area.

2 Online social networks, rumor propagation, and source detection

Social media platforms are broadly used to exchange information by billions of people worldwide. Each day, people share many updates and opinions on various topics. The popularity of these platforms is due to critical factors, including quick and accessible communication via the Internet, constant availability, and no cost. The number of active users is still growing; for Facebook, it doubled within five years, reaching about 2740 million active users in 2020 (Digital News Report, 2016, b.d.). These conditions make it easy to spread information from one user to another in close neighborhoods and across the whole social network on a given platform (Frąszczak 2021b; Guille et al. 2013; Mei Li et al. 2017). Unfortunately, it is reported that such platforms have recently been increasingly used for malicious purposes, e.g., rumor propagation. Such malicious incidents can impact different aspects of people's lives, including election results, stock prices, or even people's lives (Higdon 2020). Identifying the source of malicious information is crucial, as it can reduce disinformation and avoid more severe problems.

Online social networks are represented with graph theory. Their structure is defined by the graph \(G = (V,E)\), where \(V\) represents a finite set of nodes (users) and \(E\) is a set of edges (relations between them), encoded in an adjacency matrix. In most cases, a cell of this matrix contains "1" for connected nodes and "0" otherwise. An edge can represent a one-directional relationship between nodes, modeling an independent relation, e.g., the Twitter "following" relation between two users (Raj et al. 2018). This type of association is called "directed." In contrast, a two-directional relationship between nodes is called "undirected" and is used to model mutual relationships, e.g., Facebook friendship (Frąszczak 2021a).
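As an illustration, the sketch below builds both graph variants and the corresponding adjacency matrix. It assumes the NetworkX library, a common Python choice not prescribed by the paper itself (the study's own tooling is introduced in Sect. 4).

```python
import networkx as nx

# Undirected graph: mutual relationship, e.g., Facebook friendship.
G_und = nx.Graph()
G_und.add_edges_from([(0, 1), (1, 2), (2, 0)])

# Directed graph: one-directional relation, e.g., Twitter "following".
G_dir = nx.DiGraph()
G_dir.add_edges_from([(0, 1), (1, 2)])  # 0 follows 1, 1 follows 2

# Adjacency matrix: "1" for connected nodes, "0" otherwise.
A = nx.to_numpy_array(G_und, dtype=int)
print(A)
# [[0 1 1]
#  [1 0 1]
#  [1 1 0]]
```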

Malicious information in online social networks can be initiated from a single node or a set of nodes, the rumor sources \(V_S \subseteq V\). These nodes are called active or susceptible (Frąszczak 2021b; Mei Li et al. 2017; Rossetti et al. 2019; Tarapata & Kasprzyk 2010) and actively participate in the rumor-spreading process by passing information to their neighbors and encouraging them to participate. Each user (node) that has started passing information further in the network becomes an active node, moving the information propagation process forward. As time passes and more nodes become infected, an infection graph \(G_I\) is created. It is a subgraph of \(G\) and consists of the infected nodes \(V_I\), which have taken part in information propagation via the edges \(E_I\) [4], [7], [8]. The task of source detection methods is to find the original rumor sources based on a given infection graph \(G_I\) (Frąszczak 2021a; Jiang et al. 2017; Shelke & Attar 2019). For problems dedicated to single sources, the whole infection graph is analyzed. For multi-source ones, the number of sources is either provided as an input parameter or estimated with community detection methods able to identify it based on some metrics. The infection graph is then either divided into smaller areas, each containing one source, or analyzed as a whole, with the estimated sources scoring the highest values considered the real ones (Shelke & Attar 2019; Zang et al. 2015). In this paper, the first approach is analyzed.
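A minimal sketch of this definition, assuming hypothetical node states produced by some diffusion process, extracts \(G_I\) as the subgraph induced by the participating nodes:

```python
import networkx as nx

G = nx.karate_club_graph()  # stand-in for the full network G

# Hypothetical node states after some diffusion process; the labels
# mirror the S/I/R compartments used later in the paper.
states = {v: "S" for v in G}
states.update({0: "I", 1: "I", 2: "R", 3: "I", 7: "R"})

# G_I: subgraph induced by the nodes that took part in the propagation.
V_I = [v for v, s in states.items() if s in ("I", "R")]
G_I = G.subgraph(V_I).copy()
print(sorted(G_I.nodes), G_I.number_of_edges())
```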

Multiple approaches exist to detect rumor sources and identify outbreaks based on a given infection graph. The most popular in recent research are maximum likelihood (ML) and maximum a posteriori (MAP) estimations (Frąszczak 2021a; Shelke & Attar 2019). This paper analyzes the accuracy of those methods based on the identified outbreaks. One of the most straightforward and most used ML estimations for detecting rumor sources relies on centrality measures, which are broadly used to assess nodes in a network based on its structure (Ali et al. 2020; Das et al. 2018; Das & Kumar Sinha 2018). The underlying assumption is that information should be initiated from the most valuable nodes to reach the largest possible network coverage in the shortest time. Therefore, new centrality metrics are still being developed and used for various problems (Chebotarev & Gubanov 2020). The six metrics most popular in rumor source detection are used in this paper and introduced in Table 1. The usefulness of well-known single-source detection methods such as NetSleuth (Prakash et al. 2012), Rumor Center (Dong et al. 2013; Shah & Zaman 2010, 2011), and Jordan Center (Zhu & Ying 2013), applied with the divide-and-conquer approach, is also analyzed.
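For intuition, a centrality-based ML estimate within a single outbreak reduces to ranking the infected nodes by the chosen metric (the concrete metrics are those of Table 1). A hedged sketch, again assuming NetworkX:

```python
import networkx as nx

G_I = nx.karate_club_graph()  # stand-in for an observed infection graph

# Rank infected nodes by a centrality metric and take the top-scoring
# node as the (single-source) ML estimate for this outbreak.
scores = nx.betweenness_centrality(G_I)
source_estimate = max(scores, key=scores.get)
print(source_estimate)  # node 0 for this example graph
```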

Table 1 Most famous centrality metrics used in rumor source detection research

3 Current solutions for multi-source detection problems

The main research interest in source detection problems is focused on single-source issues. Unfortunately, fake news propagation in real-world situations is initiated by multiple sources (Jiang et al. 2017; Nguyen et al. 2016; Zang et al. 2014). The multi-source detection problem is more complex than the single one because the node evaluation must be computed for all possible subsets of infected nodes. Assuming there are \(m\) source nodes, the number of candidate source sets equals \(\binom{|V_I|}{m}\). When there are many infected nodes, \(|V_I| = O(n)\), there are \(O(n^m)\) possible source subsets. This fact makes the multi-source detection problem computationally hard for large networks, even for small values of \(m\). Most of the available multi-source detection techniques utilize the divide-and-conquer approach to decompose the multi-source detection problem into single-source ones and then apply well-known single-source detection methods. This division is obtained via network partitioning or community detection algorithms (Luo et al. 2014; Zang et al. 2014). Moreover, most of them use network partitioning or community detection methods that require the expected number of partitions, making the solution unusable for real-life problems where it is unknown. Some solutions utilize approaches different from the introduced ones: ranking-based and approximation-based. More details about them can be found in (Choi et al. 2020a, b; Shelke & Attar 2019; Zhang et al. 2017).
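The combinatorial blow-up is easy to verify numerically; the short sketch below (plain Python, illustrative values only) counts the candidate source sets \(\binom{|V_I|}{m}\):

```python
from math import comb

# Candidate source sets for |V_I| infected nodes and m sources.
for n_infected in (100, 1_000, 10_000):
    for m in (2, 3, 5):
        print(f"|V_I|={n_infected:>6}, m={m}: {comb(n_infected, m):.3e} subsets")
# m = 5 over 10,000 infected nodes already yields ~8.3e17 subsets,
# which is why exhaustive evaluation is infeasible.
```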

The main task of both community detection and network partitioning methods is to find groups (clusters/communities) in the network, so that each node belongs to one group. Desirable groups are densely connected internally and sparsely connected to nodes in other groups. Sociology researchers noticed that individuals in the same community share similarities, such as gender, age, common interests, or professional activity [32]. Therefore, the main aim of those methods is to detect groups of nodes that share similar properties and differ from other nodes concerning certain criteria. Those criteria differ between the two approaches, as they emerge from different origins: network partitioning techniques are based on graph theory, whereas community detection methods were developed based on sociology. Mathematically, network partitioning and community detection methods aim to divide \(G\) into \(q\) disjoint subgraphs \(C_i = (V_i, E_i)\), in which \(\forall i \ne j: V_i \cap V_j = \emptyset\) and \(\bigcup_{i=1}^{q} V_i = V\). The evaluated communities are then assessed with quality functions. All evaluation metrics used to assess the obtained communities are introduced in Table 2. These problems are NP-complete (Fortunato 2010), as there is an exponential number of alternative partitions. Moreover, not all community detection methods can be used in all cases: some are dedicated only to undirected and unweighted structures, and when the input network has a more complex structure, it is not recommended to abandon such details, as this may lead to inappropriate results.
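As an example of such a quality function, modularity can be computed for any partition satisfying the disjointness condition above. The sketch below uses NetworkX and the known two-group division of the Karate club network, which is also one of the networks used later in this study:

```python
import networkx as nx
from networkx.algorithms.community import modularity

G = nx.karate_club_graph()

# Two disjoint groups from the known "club" split; together they cover V,
# so they satisfy the partition definition above.
C1 = {v for v in G if G.nodes[v]["club"] == "Mr. Hi"}
C2 = set(G) - C1
print(modularity(G, [C1, C2]))  # roughly 0.36 for this division
```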

Table 2 Community detection evaluation metrics

The method presented in (Luo & Tay 2012) utilized the Voronoi partitioning method to divide a network into multiple partitions, while the classical rumor center method was used to find a single source in each of them. In (Jiang et al. 2015), the Capacity Constrained Network-Voronoi Diagram (CCNVD) (Yang et al. 2013) network partitioning method, together with a new metric of effective distance (Brockmann & Helbing 2013), was used for identifying multiple sources. Zang et al. (2014) used three different network division methods to detect multiple sources: a leading eigenvector-based approach, edge betweenness, and mixed membership block model methods. The leading eigenvector-based method divides nodes into groups satisfying two characteristics: sparse edges between different groups and abundant edges within the same group; this approach is also called a modularity-based one. Network partitioning utilizing edge betweenness divides nodes into groups by focusing on the boundaries of communities instead of their cores. The mixed membership block model divides nodes into groups based on the observation that nodes infected by the same source are more likely to be linked, while nodes infected by different sources have less contact. An extension of the leading eigenvector-based method with modularity metrics was presented in (Zang et al. 2015). That paper also introduced a heuristic algorithm for estimating the number of sources utilizing a community detection algorithm.

Besides the methods used in current research, there are further community detection and network partitioning methods that can be used to detect rumor outbreaks without requiring the expected number of groups. Such techniques include modularity-based ones like the Louvain method (Blondel et al. 2008), label propagation (Cordasco & Gargano 2011), Clauset–Newman–Moore (Clauset et al. 2004), Girvan–Newman (Girvan & Newman 2002), and many more (Frąszczak 2022). All methods used in the analysis are introduced below.

The Clauset–Newman–Moore (CNM) (Clauset et al. 2004) method utilizes both modularity and hierarchical agglomerative approaches. It is also called the fast greedy method because, thanks to its standard greedy strategy, it is significantly quicker than other algorithms. It starts with each node in its own community and, at each step, joins the pair of communities that yields the most significant modularity increase. The procedure is repeated until no such pair exists.
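A minimal invocation of CNM, assuming NetworkX's implementation of the greedy modularity merging described above:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()
# Start from singleton communities and greedily merge the pair that
# yields the largest modularity increase until no merge improves it.
communities = greedy_modularity_communities(G)
print(len(communities), [sorted(c) for c in communities])
```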

The Girvan–Newman (GN) (Girvan & Newman 2002) method identifies communities by iteratively removing edges from the original graph. At each step, it removes an edge selected by a score; in most cases, the edge with the highest betweenness centrality value is chosen.

The Louvain (LV) (Blondel et al. 2008) method maximizes a modularity score over communities. It works in two steps: local node moving and network aggregation. Each node is transferred to the community that yields the largest gain in the quality function. Afterward, an aggregated network is created from the partition computed in the first step: each community becomes a node in the aggregate network. The procedure finishes when the quality function (modularity) cannot be further improved.
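A hedged sketch of running Louvain, assuming NetworkX >= 2.8, which ships a `louvain_communities` implementation of this two-step procedure:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

G = nx.karate_club_graph()
# Local moving + aggregation, repeated until modularity stops improving.
communities = louvain_communities(G, seed=42)
print(len(communities), [sorted(c) for c in communities])
```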

The Leiden (LN) (Traag et al. 2019) method is an improvement of the Louvain algorithm. It consists of three phases: local moving of nodes, partition refinement, and network aggregation based on the refined partitions. The non-refined division is used to create an initial partition for the aggregate network.

The label propagation (LP) (Cordasco & Gargano 2011) method identifies node groups utilizing only the network structure. It needs neither a pre-defined objective function nor prior information about the communities. Communities are discovered according to the following flow: each node gets a unique label (an identifier), and then a simulation is carried out in which, at each iteration, every node updates its label to the most popular one among its neighbors. The procedure stops when each node has the majority label of its neighbors. The method is non-deterministic and can provide different results each time, so multiple simulations should be run and the most popular division used.
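Because of this non-determinism, a practical sketch runs the asynchronous LPA several times and keeps the most frequent division (NetworkX's `asyn_lpa_communities` assumed):

```python
from collections import Counter

import networkx as nx
from networkx.algorithms.community import asyn_lpa_communities

G = nx.karate_club_graph()

# Run LPA several times and keep the most frequent division.
divisions = Counter()
for seed in range(20):
    parts = frozenset(frozenset(c) for c in asyn_lpa_communities(G, seed=seed))
    divisions[parts] += 1

best, count = divisions.most_common(1)[0]
print(f"most popular division ({count}/20 runs): {len(best)} groups")
```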

Walktrap (WP) (Pons & Latapy 2005) utilizes random walks to detect communities in a network. It is based on the idea that the walks are more likely to stay within the same community because only a few edges lead outside a given community. It conducts short random walks and uses a hierarchical agglomerative approach to merge separate communities bottom-up.

The belief propagation (BF) (Zhang & Moore 2014) community detection method tries to obtain a consensus of many high-modularity partitions. This is achieved with a scalable message-passing algorithm that treats the modularity metric as a Hamiltonian and applies the cavity method.

Infomap (IP) (Rosvall & Bergstrom 2008) is based on information theory. It uses the probability flow of random walks on a network as a proxy for information flows in the real system and decomposes the network into modules by compressing the description of this probability flow.

GA (Pizzuti 2008) is a genetic-based method to find communities in networks. It detects communities by structure, classifying densely connected nodes into a group. It optimizes a simple but effective fitness function to identify densely connected groups of nodes with sparse connections between groups.

Gemsec (GM) (Rozemberczki et al. 2019) is based on a graph embedding algorithm that learns a clustering of the nodes simultaneously with computing their embedding. It places nodes in an abstract feature space where the vertex features minimize the negative log-likelihood of preserving sampled vertex neighborhoods. It also incorporates known social network properties through a machine learning regularization.

Kcut (Ruan & Zhang 2007) is a spectral-based algorithm for community detection, applicable to undirected networks with non-overlapping communities. It provides a unique combination of recursive partitioning and direct k-way methods, guaranteeing the efficiency of a recursive approach while achieving the accuracy of a direct k-way method.

The Markov clustering algorithm (MCL) (Enright 2002) utilizes simulation of stochastic flow in graphs. It discovers clusters with a mathematical bootstrapping procedure: it computes random-walk probabilities through the network and manipulates them using two alternating operations, expansion and inflation. This is achieved with Markov (stochastic) matrices that capture the mathematical concept of random walks on a graph.
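The expansion/inflation loop is compact enough to sketch directly with NumPy. The following is a simplified illustration of the idea, not the reference MCL implementation (no pruning, fixed iteration count):

```python
import networkx as nx
import numpy as np

def mcl(A: np.ndarray, inflation: float = 2.0, iters: int = 50):
    """Simplified MCL sketch: alternate expansion and inflation on a
    column-stochastic flow matrix until it (approximately) converges."""
    M = A + np.eye(len(A))        # self-loops stabilize the flow
    M = M / M.sum(axis=0)         # column-normalize: random-walk matrix
    for _ in range(iters):
        M = M @ M                 # expansion: flow along longer walks
        M = M ** inflation        # inflation: strengthen strong flows
        M = M / M.sum(axis=0)     # re-normalize columns
    # Read off clusters: columns attracted to the same row form a group.
    clusters = {}
    for col in range(M.shape[1]):
        clusters.setdefault(int(M[:, col].argmax()), []).append(col)
    return list(clusters.values())

A = nx.to_numpy_array(nx.karate_club_graph())
print(mcl(A))
```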

Paris (PS) (Bonald et al. 2018) is a hierarchical graph clustering algorithm inspired by modularity-based clustering techniques. It uses a distance between clusters induced by the probability of sampling node pairs and follows the agglomerative approach to merge communities. The algorithm's output is a regular dendrogram, which reveals the multi-scale structure of the graph.

Spinglass (SPS) (Reichardt & Bornholdt 2006) relies on an analogy between the Potts spin-glass model from statistical mechanics and the community structure. The network's community structure is interpreted as the spin configuration that minimizes the energy of the spin glass, with the spin states being the community indices. It applies the simulated annealing optimization technique to optimize the modularity.

The Surprise (SRC) (Traag et al. 2015) method discovers communities based on a dedicated evaluation metric called surprise. This quality metric assumes that edges between vertices emerge randomly according to a hypergeometric distribution; partitions get a better score if they are less likely to result from a random realization.

Scalable Community Detection (SCD) (Prat-Pérez et al. 2014) is a disjoint community detection algorithm combining different strategies. It partitions the graph by maximizing the Weighted Community Clustering (WCC). Firstly, clusters are built around highly clustered nodes, and then the initial partition is refined using the approximate WCC.

SBM DL (SBM) (Peixoto 2014) extends the stochastic block model (SBM) approach to finding communities. It utilizes Markov chain Monte Carlo and a greedy heuristic to infer the stochastic block model, fitting the non-overlapping variant by description length minimization with an agglomerative heuristic.

4 Simulation environment

The presented research has been carried out with RPaSDT (Rumour Propagation and Source Detection Toolkit) (Frąszczak 2022). It is an open-source toolkit, available under an MIT license, that simulates and analyzes the accuracy of the most common source detection methods. It is based on Python and well-known computing libraries. It provides a highly configurable, easily reusable, and user-friendly GUI-based application to simulate and analyze real scenarios for source detection problems. It allows preparing a rumor propagation experiment under any network topology with the well-known diffusion models from the literature and identifying potential diffusion sources based on the propagation graph. The toolkit also provides a set of additional tools for sophisticated network analyses, which make it possible to select different sources and verify how diffusion under a given topology and set of origins would behave. The software can also simulate propagation and source detection in other domains, such as epidemics or virus detection. The set of auxiliary graph analysis tools includes various community detection methods. These properties make it easy to simulate and analyze a wide range of source detection methods utilizing different community detection methods under various conditions.

The simulation environment has been implemented in a window-based manner to visualize and manage different aspects of the analysis simultaneously. Each window contains a separate model, which does not affect the others. This means that the user can run multiple separate analysis experiments on the initial network structure and compare them, as presented in Fig. 1. All windows can be rearranged in any way, making the analysis easier. Each window has a separate toolbar that provides operations dedicated to the window's context. In the presented example, the software displays three windows: the initial network structure, a degree analysis, and the communities detected with the Louvain algorithm. The presented scenario is the basic step in a classical network analysis task. Figure 2 presents further analysis of the introduced network. The right side shows the results of four community detection methods, whereas the left one presents the results of a rumor propagation simulation initiated by nodes '0' and '33.'

Fig. 1
figure 1

RPaSDT use case—network analysis: initial network structure, degree centrality, and Louvain communities analysis

5 Simulation conditions

Simulation experiments were conducted to assess the accuracy of well-known community detection methods from the literature that have not yet been applied to finding rumor outbreaks in online social networks. This issue is essential due to the possibility of using the divide-and-conquer rule to decompose the task of finding multiple sources in the network. As presented in the previous sections, this area of interest has not been well studied yet. The simulation case study has been carried out with multiple scenarios. The simplest one is described in detail and presented graphically with RPaSDT; for the others, only the simulation results are reported. The sources are chosen according to centrality metrics that indicate the most valuable nodes in information propagation (Britt et al. 2021; Das & Kumar Sinha 2018; Frąszczak 2021b). For all cases, the node betweenness centrality metric is used.

The dataset used in the analysis includes both synthetic and real-world networks. The synthetic ones have been generated according to well-known schemes; the most popular in social network analysis are small-world (SW) networks, e.g., Watts–Strogatz (WS); scale-free (SF) networks, e.g., Barabasi–Albert (BA); and Erdos–Renyi (ER) random networks (Frąszczak 2021a; Shelke & Attar 2019). Figure 3 presents a graphical representation of the mentioned network types. Moreover, tree-based networks are also used for the rumor source detection problem. For the simulation, SW and BA networks are used, representing uniform and non-uniform networks, respectively, where nodes represent real individuals and edges represent the connections between them. The networks are generated based on the configuration from (Ju et al. 2022) used for COVID-19 rumor propagation: WS with p = 0.4 and k = 10, and BA with m = 5 (yielding an average degree of about k = 10).
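Under these settings, the synthetic networks can be reproduced approximately with NetworkX's standard generators (a sketch; the exact network sizes used in the study are listed in Table 3, so `N` below is a placeholder):

```python
import networkx as nx

N = 1_000  # placeholder size; actual network sizes are given in Table 3

# Watts–Strogatz small-world: ring of k = 10 neighbours, rewiring p = 0.4.
ws = nx.watts_strogatz_graph(n=N, k=10, p=0.4, seed=42)

# Barabasi–Albert scale-free: preferential attachment, m = 5 edges per
# new node (average degree close to 10).
ba = nx.barabasi_albert_graph(n=N, m=5, seed=42)

print(ws.number_of_edges(), ba.number_of_edges())
```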

Fig. 2
figure 2

RPaSDT—visualization of different community detection methods for the given infection graph

Fig. 3
figure 3

Different network topologies: a random tree, b regular d-tree, c small-world, d scale-free network

Real-world datasets have been built upon real social network analyses; dedicated tools were often prepared to gather the necessary data from the most popular social platforms. For rumor source detection, real-world datasets mainly come from Twitter, Facebook, Wiki-vote, the Chinese microblogging platform Sina Weibo, and the Enron Email corpus. The datasets described below were used (Ryan & Nesreen 2015; Shu et al. 2019; Stanford Large Network Dataset Collection, b.d.). The research was conducted for different types of networks to determine their pros and cons based on network structural properties. The properties of the networks used in the study are presented in Table 3.

Table 3 Networks and their analysis used in the study

Rumor diffusion on social media platforms can proceed in various ways, but the aim of this process is always the same: to cover as many nodes of the network as possible, as soon as possible. Researchers have developed various models to simulate different behaviors by utilizing gathered data and analyzing past events. Nowadays, the most famous and most widely applied across multiple domains are epidemic models. They have been developed based on analyses of epidemic spread in society and utilize compartments, mutually exclusive groups defined by disease status. Each individual is located in one compartment at a given time but can move to another depending on the model parameters. There are two main assumptions underlying this approach: each node can be classified into a distinct state (compartment), and each individual has the same opportunity to meet an infected node. It has been shown that information spread via online social networks can follow the same rules. The propagation simulation is conducted with the SIR model, as it better imitates rumor diffusion in online social networks: it allows nodes to get "healed" and stop propagating after realizing that the passed information is malicious. A summary of the most popular models is presented in Fig. 4 (Cheng et al. 2013; Kasprzyk et al. 2011; Kasprzyk & Najgebauer 2021; Mei Li et al. 2017).

Fig. 4
figure 4

Epidemic models in information spread context

The SIR model has been used for the simulation with the configuration presented in (Ju et al. 2022), which was used there for the propagation of COVID-19 fake information. As introduced in that paper, such a model is still suitable for simulating rumor diffusion in social networks. The probability of a node getting "infected" (transition from the S to the I state) is 0.1, and the probability of a node stopping propagation (transition from the I to the R state) is 0.05. The primary purpose of this research is to measure the accuracy of network partitioning methods in identifying rumor outbreaks. The infection graph should be connected, which would be hard to achieve considering only currently infected nodes; therefore, the infection graph is computed over nodes in either the infected or the recovered state.
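A hedged sketch of this configuration using NDlib, a common Python diffusion library (an assumption for illustration; RPaSDT's internal API may differ):

```python
import networkx as nx
import ndlib.models.epidemics as ep
import ndlib.models.ModelConfig as mc

G = nx.barabasi_albert_graph(1_000, m=5, seed=42)
sources = [0, 1]  # hypothetical rumor origins

model = ep.SIRModel(G)
config = mc.Configuration()
config.add_model_parameter("beta", 0.1)    # S -> I probability, as above
config.add_model_parameter("gamma", 0.05)  # I -> R probability, as above
config.add_model_initial_configuration("Infected", sources)
model.set_initial_status(config)

model.iteration_bunch(50)  # 50 diffusion iterations

# Infection graph over infected (1) and recovered (2) nodes.
V_I = [v for v, s in model.status.items() if s in (1, 2)]
G_I = G.subgraph(V_I).copy()
print(G_I.number_of_nodes(), "nodes in the infection graph")
```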

6 Simulation

Rumor propagation simulation has been conducted on the data sources described in Table 3 with the SIR model. The infection graph \(G_I\) is computed based on both recovered and infected nodes. The simulation is examined for numbers of source nodes representing 0.1%, 1%, and 10% of all nodes in the network, respectively. The node betweenness centrality metric indicates the best ones (Şen et al. 2016), and the process is simulated for 50 iterations. That number can be exceptionally increased to obtain a connected infection graph; the detection process could be disturbed if some communities were disconnected from the start. The case with 100% network coverage is omitted, as the whole network would then have to be analyzed, making it harder to identify the real sources. This approach has real-case premises: in general, fake news is not propagated across the entire network, only a part of it. After simulating the expected number of rumor diffusion iterations, community detection and network partitioning algorithms are applied to the given \(G_I\). Their accuracy is evaluated, and then the utilization of the well-known source detection methods is also verified (Table 4).

Table 4 Metrics used for the evaluation of rumor outbreaks detection

In the simplest case, the Karate club network is analyzed and visualized. For the other scenarios, only the simulation results are presented, as most networks are hard to picture due to their size. The Karate club is a social network of a university karate club; more about it can be found in Table 3. First, the expected number of sources is selected with the chosen method, i.e., nodes 2 and 33. Then the rumor propagation process is simulated for the expected number of iterations with the SIR model. The obtained infection graph \(G_I\) is then analyzed with rumor outbreak detection methods. The described flow is presented in Figs. 5 and 6: the first shows the initial structure and the situation after simulating the propagation from the source nodes, whereas Fig. 6 presents the results of the outbreak detection methods. It can be noticed that the visual analysis for all the methods can be performed simultaneously.

Fig. 5
figure 5

Initial network with selected sources and rumor propagation over it

Fig. 6
figure 6

Visualization of the rumor outbreak detection based on infection graph with different methods

Tables 5, 6, 7, 8 present the results of each algorithm execution on the different networks. The summary shows the detected communities, their size, and the computational time. The tables contain data averaged over ten experiments per case to remove random bias. The simulation was performed on an Intel Core i7-10700 CPU at 2.90 GHz, supported by 64 GB RAM and an SSD disk, running on the Linux Ubuntu platform. The analysis was performed on the host machine with the runnable package of RPaSDT (Frąszczak 2022). The toolkit also provides a Docker-based runtime environment, but the standalone package dedicated to the Linux platform was used to avoid any redundant load. Computational time was rounded to tenths of a second. The results for algorithms that took over 120 s are omitted, as they would not be a good fit for much bigger networks containing billions of nodes; moreover, they could be ineffective for processing networks in real time (Table 9).

Table 5 Simulation results for Karate and Football real social networks
Table 6 Simulation results for Social and Facebook real social networks
Table 7 Simulation results for Small-World synthetical social networks
Table 8 Simulation results for Scale-Free synthetical social networks
Table 9 Outbreak detection methods summary ranking

The evaluation of the accuracy of the well-known network partitioning and community detection methods in identifying rumor outbreaks has been performed with the community-partitioning metrics introduced in Table 2. They were extended with the classification-based metrics presented in Table 4 to provide a better evaluation of rumor outbreak identification. For the assessment, the information about the propagation and the source infections was used to define the ground-truth communities used by the metrics.
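For example, NMI, one of the metrics from Table 2, can be computed directly from ground-truth and detected outbreak labels. The sketch below uses scikit-learn and made-up label vectors:

```python
from sklearn.metrics import normalized_mutual_info_score

# Ground-truth outbreak label per node (which source infected it) versus
# the label from a detected partition; both vectors are made up here.
truth    = [0, 0, 0, 1, 1, 1, 2, 2, 2]
detected = [0, 0, 1, 1, 1, 1, 2, 2, 0]
print(normalized_mutual_info_score(truth, detected))  # 1.0 = perfect match
```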

Based on the obtained coverage results, it can be observed that the analyzed networks fall into two groups based on their topology. For scale-free networks (Facebook, Social, SF-1, SF-2), even a few (but very important) sources can reach great network coverage, and adding new rumor sources does not improve the diffusion process. For small-world networks (Karate, Football, SM-1, SM-2), adding more rumor sources does make a difference, yielding better network coverage in a shorter time.

Unfortunately, not all examined methods could detect outbreaks within the maximum expected time (the 'N/A' value in the tables), so they were removed from further analysis. The visual representation of the completed detections is presented in Fig. 7. Including them in the research could lead to incorrect conclusions; moreover, they are not eligible for big networks. The methods removed from further analysis are SPS, Kcut, BF, GA, and SPL.

Fig. 7
figure 7

Ex. 1 Completed outbreaks (communities) detection experiments per outbreak detection method (more is better)

In the case of the average detection error presented in Fig. 8, three methods delivered worse results than the others, and in the case of SRC, this difference was significant. The NMI metric for the evaluated methods is similar and close to 0.3, but SRC and MCL exceed 0.4, surpassing the other approaches. Based on that, it appears that only about 30% of all nodes are correctly assigned to the correct outbreak for most of the tested methods. In the context of rumor source identification, this value is too low, as it can lead to inappropriate results or hamper the analysis because nodes are assigned to different outbreaks than they belong to. It is also worth mentioning that SRC, MCL, and WP, which are the best in the NMI benchmark, are the worst in ADE (Fig. 8). This means they were better at assigning nodes to the correct outbreaks but less precise in estimating the correct number of them. As the tables above show, they produced many more, less numerous groups, which raises their NMI but increases the error. That thesis is confirmed in Fig. 9, where the SRC method found the largest number of empty outbreaks (without the real source), which is undesirable. The best in that metric is PS, although GN, LV, LN, CNM, SBM, and LP also received good results. A well-performing outbreak detection method should keep that value as low as possible, reducing the number of false positives (Fig. 10).

Fig. 8
figure 8

Ex. 2 Average detection error per method (less is better)

Fig. 9
figure 9

Ex. 3 Average number of identified empty outbreaks per method (less is better)

Fig. 10
figure 10

Ex. 4 Average NMI per method (more is better)

In most cases, the tested techniques overestimate the total number of outbreaks. This trend becomes more noticeable after removing the cases with the largest outbreak numbers from the evaluated dataset. This assumption is consistent with real-life cases, as fake news sources work in small and separate groups (Jin & Wu 2021; Li et al. 2019; Shelke & Attar 2019). It is also worth mentioning that correctly identifying the number of rumor outbreaks is rare and happens mostly for small networks.

For all algorithms, the size of the identified outbreaks varies greatly, often yielding groups of just a few nodes alongside multi-node ones. Moreover, the difference between the average outbreak size and its outliers is significant.

This feature is not desirable, as it can automatically lead to incorrect results in the context of rumor source identification. It is unlikely that, in a real scenario, fake news sources would start propagating malicious content from an isolated group of users; they aim to reach as many users as possible in the shortest time. In the rumor source identification process, merging such groups with the biggest ones is recommended, as it can provide better information context for the detection method. Unfortunately, none of the examined methods has provided results containing outbreaks of similar size. This thesis is confirmed further in this paper by its effects on the effectiveness of identifying sources. Figure 11 illustrates the number of completed source detection tasks per outbreak detection method. It can be observed that only some of them were able to complete detection in the expected execution time, which again was set to 120 s, so only some combinations of the presented methods can be used to resolve real scenarios with bigger networks. Methods like LP, IP, or CNM completed almost the expected number of detections, except for the biggest networks. The methods considered for further analysis are SRC, WP, LV, LN, GN, LP, IP, and CNM. The method marked 'C' in the figures refers to standard betweenness centrality, 'UC' to its unbiased version, 'CM' to traditional betweenness centrality computed on outbreaks, and 'UCM' to the unbiased version computed on outbreaks. Based on the results presented in Fig. 12, it can be noticed that each source detection algorithm has been evaluated with a different number of successes, so further evaluation is conducted per source detection method. Moreover, it can be observed that adding the rumor outbreak detection step to centrality-based source detection makes such methods faster, as they were able to perform more experiments.

Fig. 11
figure 11

Ex. 5 Completed source detection experiments in identified outbreaks per outbreak detection method (more is better)

Fig. 12
figure 12

Ex. 6 Completed source detection experiments in outbreaks per source detection method (more is better)

The results illustrated in Figs. 13, 14, 15, 16 present the examined source detection methods working on outbreaks identified by the different techniques. The results are sorted in decreasing order of the F-score, the harmonic mean of precision and recall. The last column of each graph, 'REAL,' contains the source detection results obtained on the real outbreaks. The best results are obtained by source detection based on betweenness centrality, as this metric was used to select the propagation origins. A simple ranking has been used to find the best method for detecting correct outbreaks: the rank is the sum of the technique's positions in the analyzed metrics, and the method with the lowest ranking value is considered the best. In other words, this measure shows how many times a given algorithm was the best among the others. Based on the introduced metric, the Leiden algorithm was the best across all analyses, but GN was the best for the source detection experiments alone. In summary, the LN and GN methods outperformed the others and should be taken into further analysis and used as references for new methods.
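For completeness, the precision, recall, and F-score used to sort these results can be stated as a short sketch (set-based definitions assumed, with hypothetical source sets):

```python
def source_detection_scores(true_sources, detected_sources):
    """Precision, recall, and F-score for a set of detected sources."""
    true_sources, detected_sources = set(true_sources), set(detected_sources)
    tp = len(true_sources & detected_sources)  # correctly found sources
    precision = tp / len(detected_sources) if detected_sources else 0.0
    recall = tp / len(true_sources) if true_sources else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

# Hypothetical example: two real sources, three detected candidates.
print(source_detection_scores({2, 33}, {2, 5, 33}))  # (0.667, 1.0, 0.8)
```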

Fig. 13
figure 13

Ex. 7 Source detection evaluation based on Rumor Center with outbreak detection methods

Fig. 14
figure 14

Ex. 8 Source detection evaluation based on Jordan Center with outbreak detection methods

Fig. 15
figure 15

Ex. 9 Source detection evaluation based on NetSleuth with outbreak detection methods

Fig. 16
figure 16

Ex. 10 Source detection evaluation based on betweenness centrality with outbreak detection methods

7 Conclusions

Social media platforms are broadly used to exchange information by billions of people worldwide. Unfortunately, they are increasingly used with malicious intent. Finding a rumor source is a crucial attempt at controlling, preventing, and learning about the propagation of falsified information in networks. Unfortunately, the networks' size and complex structure make the problem of correctly identifying real sources harder. Current research on rumor source detection methods is mostly oriented toward single-source issues, which is inappropriate for real-life scenarios, and most available multi-source detection approaches require the exact number of sources, whereas such information is generally not known a priori. This paper addresses that problem. To the best of our knowledge, this is the first comprehensive survey and analysis focusing on techniques for seeking propagation outbreaks in various networks without knowing their exact number. It presents a variety of well-known network partitioning and community detection methods applied in a simulation case study to identify the best ones, describe their drawbacks, and provide guidelines for new methods directly oriented toward the rumor multi-source identification problem.

The presented results highlight the lack of methods that can correctly estimate the real number of sources. All of the techniques were designed for use in contexts other than rumor source detection, under specific conditions. Moreover, the NMI results indicate that only a small part of the nodes is correctly assigned to the correct outbreak. Most of the presented methods share the same trend: they overestimate the total number of potential sources, leaving small groups of nodes alongside huge ones, which strongly impacts the accuracy of the source detection methods. Another drawback is the detection of "empty" outbreaks that do not contain a real source. Moreover, evaluating the source detection methods applied to the over-identified outbreaks confirmed the assumption that correct outbreak detection is crucial in finding multiple sources: the accuracy of all examined source detection methods increased significantly after applying them to the actual propagation outbreaks. Nevertheless, the comparison of the various network partitioning approaches gave a good overview, indicating the best ones to be used in future research. Detecting fake news outbreaks has different properties than generic network partitioning and requires reliable, dedicated methods to improve the accuracy of current procedures. Such a method should recognize the most significant areas of propagation to be consumed by the detection methods, be characterized by short computation times, and be usable in real-life scenarios for networks with huge numbers of nodes.