In this section we present the results of detecting and characterizing the major clusters in the network of retweets, i.e., the online party organization networks.
Community detection
To detect the online organization network of each political party, we apply the N-Louvain method. This new version has been designed to detect clusters which only include nodes that are reliably assigned to them. We apply the method by running the Louvain method 100 times and assigning to each cluster only the nodes that fall into that cluster more than 95 times (\(N=100, \varepsilon =0.05\)). By inspecting the results of the 100 executions, a constant presence of eight major clusters, much bigger than the other clusters, is observed. The composition of these clusters is also quite stable: 4973 nodes (82.25%) are assigned to the same cluster in over 95 executions.
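The reliability filter described above can be sketched as follows. This is a minimal illustration, not the implementation used in this article: the function name `stable_assignment` is hypothetical, and it assumes that the N Louvain partitions have already been computed and that cluster labels have been matched across executions (a step not detailed in this section).

```python
from collections import Counter


def stable_assignment(runs, epsilon=0.05):
    """Keep only the nodes that fall into the same cluster in more than
    (1 - epsilon) * N of the N executions.

    `runs` is a list of N dicts mapping node -> cluster label. Labels are
    assumed to be already matched across executions (hypothetical
    preprocessing step, not described in the text).
    """
    n_runs = len(runs)
    threshold = (1 - epsilon) * n_runs

    # Count, for every node, how often it received each label.
    counts = {}
    for partition in runs:
        for node, label in partition.items():
            counts.setdefault(node, Counter())[label] += 1

    # A node is assigned only if its most frequent label is frequent enough.
    assigned = {}
    for node, counter in counts.items():
        label, hits = counter.most_common(1)[0]
        if hits > threshold:
            assigned[node] = label
    return assigned
```

With \(N=100\) and \(\varepsilon =0.05\), a node is kept only if its most frequent cluster occurs in more than 95 executions; all other nodes remain unassigned.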
We examine the most relevant nodes of every cluster, according to PageRank, and find a single cluster for almost every party: \({\rm ERC ^{\rm{rt}}}\), \({\rm CUP ^{\rm{rt}}}\), \({\rm Cs ^{\rm{rt}}}\), \({\rm CiU ^{\rm{rt}}}\), \({\rm PP ^{\rm{rt}}}\), and \({\rm PSC ^{\rm{rt}}}\). The only exception to this rule is BeC, which is composed of two clusters. A manual inspection of the users from these two clusters reveals that one cluster is formed by the official accounts of the party (e.g., @bcnencomu, @ahorapodemos), allied parties (e.g., @ahoramadrid), the candidate (@adacolau), and a large community of peripheral users. In contrast, the other cluster is composed of activists engaged in the digital communication for the campaign (e.g., @toret, @santidemajo, @galapita), i.e., party activists, many of whom are related to the 15M movement. For this reason, from now on, the analysis distinguishes these clusters as \({\rm BeC{\text{-}}p ^{\rm{rt}}}\) and \({\rm BeC{\text{-}}m ^{\rm{rt}}}\): party and movement, respectively.
Table 2 shows the five users with the highest PageRank in each cluster, and their role with respect to the corresponding party: candidate (the account of the candidate for mayor), party (official accounts of parties associated with the candidacy), activist (party activists), institution (institutional accounts), media (accounts of media or journalists). We also considered the category politician to distinguish professional politicians from activists; however, no politician with an institutional position was found among the top five users of any cluster. While the most relevant users tend to correspond to each party’s candidate and official accounts, which is partly caused by the data collection criteria, it is interesting to note the presence of other very central nodes in these clusters, including media or institutional accounts (the municipality account, in the cluster of the outgoing mayor’s party). BeC-m is the only cluster for which the top users are mostly activists.
Table 2 Top 5 users for the 8 largest clusters according to their PageRank in the overall network, with their role with respect to the corresponding party
The boundaries between ideological online communities are visible in Fig. 2. As one could expect in any polarized scenario, the largest number of retweets occurs within the same cluster. There exists, however, a notably large number of links between the two clusters of BeC (\({\rm BeC \text{-}p ^{\rm{rt}}}\) and \({\rm BeC\text{-}m ^{\rm{rt}}}\)). Figure 3 presents the sub-network formed by the nodes and links of both clusters. To further assess the low level of interaction between major parties, an interaction matrix A is defined, where \(A_{i,j}\) counts all retweets that accounts from cluster \(i^\mathrm{rt}\) made of tweets from users of cluster \(j^\mathrm{rt}\). Since the clusters have different sizes, \(A_{i,j}\) is normalized by the sum of all the retweets made by the users assigned to cluster i. Figure 4 shows matrix A for all the clusters and confirms that the vast majority of retweets were made between users from the same cluster (main diagonal). This is also true in the case of the two clusters of Barcelona en Comú, although there is some communication between the movement and party clusters, mostly from the movement to the party \(({\rm BeC \text{-}m ^{\rm{rt}}} \rightarrow {\rm BeC \text {-}p ^{\rm{rt}}} = 0.18)\), the largest value off the main diagonal.
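The construction of the interaction matrix can be illustrated with the short sketch below. The function name `interaction_matrix` and the input format are hypothetical; in particular, the sketch assumes that retweets involving users without a reliable cluster are ignored, which this section does not state explicitly.

```python
from collections import defaultdict


def interaction_matrix(retweets, cluster_of):
    """Row-normalized retweet matrix between clusters.

    retweets:   iterable of (retweeting_user, original_author) pairs.
    cluster_of: dict mapping user -> cluster label; users without a
                reliable cluster are simply absent from the dict.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for retweeter, author in retweets:
        i = cluster_of.get(retweeter)
        j = cluster_of.get(author)
        if i is None or j is None:
            continue  # assumption: unassigned users are left out
        counts[i][j] += 1

    # Divide each row by the number of retweets made by cluster i,
    # so that clusters of different sizes become comparable.
    matrix = {}
    for i, row in counts.items():
        total = sum(row.values())
        matrix[i] = {j: c / total for j, c in row.items()}
    return matrix
```

Each row of the returned matrix sums to 1, so the diagonal entry of a row is the share of a cluster's retweets that stay within that cluster.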
As mentioned above, the new version of the Louvain method proposed in this article assigns a node to one of the eight largest clusters only if it falls into that particular cluster in more than 95 of the 100 executions. The final inclusion or exclusion of the most relevant nodes was manually inspected in order to assess the performance of this new version. To preserve the political preference of non-public users, Table 3 only presents the 20 most relevant nodes which were not assigned to any cluster, their role, and how many times they fell into each cluster over the 100 executions. The results show that the N-Louvain method effectively prevented the inclusion of media accounts in the intra-network of political parties, e.g., @btvnoticies, @elperiodico, @elsmatins, etc. Also, for better readability, when a node falls into several political clusters in more than 20% of the executions each, we highlight the corresponding values in Table 3. First, we observe that the Catalan pro-independence media outlet @naciodigital and two journalists from that outlet (@bernatff, @jordi_palmer) fell in \({\rm ERC ^{\rm{rt}}}\) and \({\rm CUP ^{\rm{rt}}}\), i.e., clusters of Catalan pro-independence parties. Second, we find that the TV show @puntcattv3 fell in \({\rm ERC ^{\rm{rt}}}\) and \({\rm PSC ^{\rm{rt}}}\) and the media outlet @xriusenoticies in \({\rm CiU ^{\rm{rt}}}\) and \({\rm PSC ^{\rm{rt}}}\). Results also show that @mariamariekke, a citizen who created drawings for the BeC campaign, fell between the two clusters of the party (party and movement). Finally, we also find of interest the appearance of civic organizations: Plataforma de Afectados por la Hipoteca, mostly in \({\rm BeC \text{-}m ^{\rm{rt}}}\) (an organization to stop evictions which was co-founded by the candidate of BeC), and Vaga de Totes (a feminist labor organization), which lies between the left parties \({\rm BeC \text{-}m ^{\rm{rt}}}\) and \({\rm CUP ^{\rm{rt}}}\).
Table 3 Most relevant nodes, according to PageRank, which could not be reliably assigned to any of the major clusters indicating the number of executions in each cluster
Comparison to the Clique Percolation Method
The design of the N-Louvain method is motivated by the fuzzy community structure of political networks, such as the one formed around the campaign for the 2015 Barcelona City Council election. These networks are usually formed by overlapping communities, and the proposed algorithm improves the standard Louvain method by identifying clusters in a more stable way. However, we should note that the state of the art includes community detection methods designed for overlapping communities. In particular, the Clique Percolation Method (CPM) is the most popular one according to [21]. This method is applied to the network of retweets with the CFinder software package (see footnote 8) to detect k-cliques, i.e., complete (fully connected) sub-graphs of k nodes. Figure 5 presents the number of k-clique graphs obtained through the CPM at every value of k. As expected, the number of k-clique graphs tends to decrease as k increases. At its maximum value (\(k=13\)), the method only detects two k-clique graphs: one formed by users from BeC and another formed by users from CiU.
While the Louvain method was able to identify every party cluster, CPM at its maximum value only detects two party clusters. This is explained by the different size and structure of the party networks. For this reason, the communities at different values of k have been examined. When \(k=9\), CPM identifies seven k-clique graphs. The inspection of the nodes of each of them reveals that two of them are related to BeC, one is related to a municipal police trade union, and the rest are related to each of the political parties CiU, CUP, Cs, and PP. For PSC and ERC, CPM identifies k-clique graphs when \(k=8\) and \(k=7,\) respectively. To compare these results with the clusters from the N-Louvain method, Table 4 indicates how many nodes of each k-clique graph occurred in each cluster, and reveals that:
- All the nodes of the k-clique graphs related to CiU, Cs, CUP, ERC, and PSC are part of the corresponding clusters from the N-Louvain method.
- Only one node from the PP k-clique graph was not in the PP political cluster.
- The nodes from the k-clique graph related to a trade union of municipal police (\(\rm GU\)) were not in any political cluster.
- The larger BeC k-clique graph (\(\rm BeC_1\)) is mainly formed by nodes from the BeC movement cluster. The smaller k-clique graph (\(\rm BeC_2\)) is composed of two nodes from the BeC party cluster and seven nodes from the BeC movement cluster.
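The comparison summarized in the list above amounts to a cross-tabulation of k-clique graph membership against cluster membership. A minimal sketch is given below; the function name `overlap_table`, the label `"none"` for unassigned nodes, and the toy node names in the usage example are all hypothetical and do not correspond to the data of this study.

```python
from collections import Counter


def overlap_table(kclique_graphs, cluster_of, unassigned="none"):
    """Count, for each k-clique graph, how many of its nodes fall into
    each cluster (or into no cluster at all).

    kclique_graphs: dict mapping a k-clique graph name -> set of nodes.
    cluster_of:     dict mapping node -> cluster label; nodes that were
                    not reliably assigned are absent from the dict.
    """
    table = {}
    for name, members in kclique_graphs.items():
        table[name] = Counter(cluster_of.get(node, unassigned) for node in members)
    return table


# Toy usage with made-up nodes (not taken from the study's data).
toy_graphs = {"G1": {"u1", "u2", "u3"}, "G2": {"u3", "u4"}}
toy_clusters = {"u1": "A", "u2": "A", "u3": "B"}
toy_table = overlap_table(toy_graphs, toy_clusters)
```

Because k-clique graphs may overlap, the same node (here `u3`) can be counted in more than one row of the table.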
Figure 6 presents all these k-clique graphs to better understand their composition. The figure shows an overlap between the two BeC k-clique graphs which is composed of three nodes: @bcnencomu (party account), @adacolau (candidate), and @ciddavid (party member). It is interesting to observe that, although the rest of the nodes of the smaller k-clique graph belong to the movement cluster, all of them are related to Iniciativa per Catalunya Verds, the main pre-existing party that converged in Barcelona en Comú. In other words, CPM also identifies a k-clique graph related to the institutional elite of BeC and a much larger k-clique graph related to the grassroots elements of BeC.
Table 4 Clusters obtained through Clique Percolation Method, k value of k-clique graph, and number of nodes which occur in the clusters obtained through the N-Louvain method
In conclusion, the results from applying CPM are consistent with the ones obtained through the community detection algorithm proposed in this article. However, the N-Louvain method has two substantial advantages over CPM:
- Because the political networks differ in size and structure, CPM at the maximum value of k only detects two major clusters. In contrast, the new method is able to identify every party cluster.
- The clusters obtained through CPM are k-cliques and, therefore, such clusters are dense graphs formed by the core of the party network structure. Social networks are characterized by a heavy-tailed degree distribution, so the k-clique graphs exclude the large number of less active users. Recent studies have shown that these are the nodes which compose the critical periphery in the growth of protest movements [6]. For this reason, the inclusion of these nodes, as the new method does, becomes essential for the following characterization of clusters.
Cluster characterization
The eight clusters detected by the community detection algorithm are then characterized in terms of hierarchical structure, small-world phenomenon, and coreness.
Hierarchical structure
To evaluate the hierarchical structure, the in-degree inequality of each cluster is measured with the Gini coefficient. In-degree centralization, originally suggested in [22], is also computed.
The results in Table 5 show a notable divergence between the two metrics: the inequality values of \({\rm CiU ^{\rm{rt}}}\) and \({\rm PP ^{\rm{rt}}}\) are similar (\(G_\mathrm{in}=0.893\) and \(G_\mathrm{in}=0.876\), respectively), but the centralization of \({\rm PP ^{\rm{rt}}}\) (\(C_\mathrm{in}=0.378\)) is far from the maximum centralization value exhibited by \({\rm CiU ^{\rm{rt}}}\) (\(C_\mathrm{in}=0.770\)). For Barcelona en Comú, \({\rm BeC \text{-}m ^{\rm{rt}}}\) emerges as the least unequal and the least centralized structure, while \({\rm BeC \text{-}p ^{\rm{rt}}}\) forms the most unequal cluster (\(G_\mathrm{in}=0.995\)). The results in Table 5 also confirm that the in-degree centralization formulated in [22] is almost equal to the ratio between the maximum in-degree and the number of nodes. We therefore conclude that this metric is not well suited to capturing hierarchical structure in social diffusion graphs, and that the Gini coefficient for in-degree inequality represents a more reliable measure. Finally, the Lorenz curve of the in-degree distribution of the clusters is presented in Fig. 7 to visually validate the different levels of inequality among clusters.
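Both metrics can be computed from the list of in-degrees of a cluster, as in the sketch below. The Gini coefficient follows the standard rank-based formula. For the centralization, the sketch assumes a Freeman-style normalization in which the sum of differences to the maximum in-degree is divided by \((n-1)^2\), the value reached by a directed star; the exact formulation of [22] is not reproduced in this section, so this choice is an assumption. The function names are hypothetical.

```python
def gini(values):
    """Gini coefficient of a list of non-negative values (rank-based formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def in_degree_centralization(in_degrees):
    """Freeman-style in-degree centralization (assumed normalization).

    The numerator sums the gaps between the largest in-degree and every
    other in-degree; the denominator (n - 1)^2 is the value of that sum
    for a directed star, the most centralized configuration.
    """
    n = len(in_degrees)
    if n < 2:
        return 0.0
    d_max = max(in_degrees)
    return sum(d_max - d for d in in_degrees) / ((n - 1) ** 2)
```

Under this normalization the numerator equals \(n \cdot d_\mathrm{max}\) minus the number of edges, so for a sparse cluster the centralization is close to \(d_\mathrm{max}/n\), in line with the ratio r reported in Table 5.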
Table 5 Inequality based on the Gini coefficient (\({ G_{\rm{in}}}\)) and centralization (\({C_{\rm {in}}}\)) of the in-degree distribution of each cluster in the network of retweets, and ratio between the maximum in-degree and the number of nodes (r)
Small-world phenomenon
Broadly speaking, the efficiency of a social network is explained by its small-world phenomenon, i.e., the phenomenon of users being linked by a mutual acquaintance. To assess the small-world phenomenon in each party, the average path length and the clustering coefficient are computed.
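The two measures can be sketched as follows. The sketch assumes that each intra-network is treated as an undirected graph given as an adjacency dictionary, and that the average path length is taken over reachable pairs only; neither choice is specified in this section, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient.

    adj: dict node -> set of neighbours (undirected, no self-loops).
    """
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count the links that exist among the neighbours of the node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    dist_sum = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else 0.0
```

A star-shaped graph yields a clustering coefficient of exactly 0 under this definition, since no two neighbours of the center are linked to each other.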
Table 6 reveals that \({\rm BeC \text{-}m ^{\rm{rt}}}\) has the highest clustering coefficient (\(\mathrm{Cl}=0.208\)), closely followed by \({\rm PP ^{\rm{rt}}}\) and \({\rm PSC ^{\rm{rt}}}\), the two smallest clusters by size. In contrast, the clustering coefficient of \({\rm BeC \text{-}p ^{\rm{rt}}}\) is almost 0. This finding is explained by the topology of \({\rm BeC \text{-}p ^{\rm{rt}}}\), roughly formed by stars whose center nodes are the most visible Twitter accounts of Barcelona en Comú: the party accounts and the candidate.
No remarkable patterns regarding the average path length are observed. It is lower than 3 for the majority of the party clusters, with the \({\rm PSC ^{\rm{rt}}}\) cluster having the lowest value (\(l=2.29\)). At the same time, \({\rm ERC ^{\rm{rt}}}\), \({\rm CiU ^{\rm{rt}}},\) and \({\rm BeC \text{-}p ^{\rm{rt}}}\) exhibit the longest average path lengths (5.43, 4.66, and 3.35, respectively), which might signal lower information efficiency, especially in the case of \({\rm ERC ^{\rm{rt}}}\).
Table 6 Number of nodes (N) and edges (E), clustering coefficient (Cl), and average path length (l) of the intra-network of each cluster in the network of retweets
Coreness
The coreness of a network is closely related to its social resilience, i.e., the ability of a social group to withstand external stresses [23]. To measure the social resilience of a social network, the k-core decomposition of each cluster is performed in order to evaluate the distribution of the nodes within each k-core. The more nodes lie in the innermost cores, i.e., the ones with the largest k-indexes, and the larger the maximal k-index, the more resilient the cluster is.
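The k-core decomposition can be obtained by repeatedly removing the node with the smallest remaining degree, as in the sketch below. It assumes an undirected adjacency dictionary (how edge directions are handled is not specified in this section) and favors readability over speed; the function name `core_numbers` is hypothetical.

```python
def core_numbers(adj):
    """k-index (core number) of every node, by iterative peeling.

    adj: dict node -> set of neighbours (undirected, no self-loops).
    """
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree; the current
        # core level never decreases during the peeling.
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Toy usage: a triangle (a, b, c) with one pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
toy_core = core_numbers(toy)
k_max = max(toy_core.values())
k_avg = sum(toy_core.values()) / len(toy_core)
```

The maximal and average k-indexes of a cluster are then the maximum and the mean of the returned values.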
Table 7 presents the maximal and average k-indexes for each cluster and Fig. 8 shows the corresponding distributions. As in the case of hierarchical structure and small-world phenomenon, \({\rm BeC \text{-}m ^{\rm{rt}}}\) (\(k_\mathrm{max}=17\), \(k_\mathrm{avg}=5.90\)) and \({\rm BeC \text{-}p ^{\rm{rt}}}\) (\(k_\mathrm{max}=5\), \(k_\mathrm{avg}=1.33\)) exhibit the highest and lowest values, respectively. There are clear differences between the node distributions of both \({\rm BeC \text{-}m ^{\rm{rt}}}\) and \({\rm BeC \text{-}p ^{\rm{rt}}}\) and those of the other parties (the largest concentration of the nodes is in the first k-cores and a considerable part is in the innermost cores). Therefore, the movement group of Barcelona en Comú is an online social community with an extreme ability to withstand or recover from external stresses. At the same time, the party group of Barcelona en Comú seems to only focus on the core users.
Table 7 Maximal and average k-index (standard deviation in parentheses) for the intra-network of each cluster in the network of retweets