1 Introduction

The popularity of social media studies in the context of Russian politics started to take off during the 2011–2012 civil protests against what are widely seen as fraudulent results of the 2011 parliamentary elections. In the absence of impartial and objective coverage of elections in the traditional media (Golos 2011) various Social Network Sites (SNS) appeared to be instrumental in circulating information among citizens, effectively serving as an alternative source of trustworthy information on the election process. The key feature of social media functionality during the 2011–2012 protest was its multi-channel and multi-hierarchical structure. Spontaneously emerging information posted online about fraud and other infringements over the course of the elections was picked up and popularized by famous bloggers and popular public channels. It helped build awareness of the magnitude of committed acts and reaffirm large public discontent regarding the validity of the election process and the pronounced winner—the pro-Kremlin Edinaâ Rossiâ (United Russia) party. Furthermore, social media were instrumental in organizing and coordinating the protest as a key means of information circulation (for more on digital activism, see Chap. 8). Earlier academic accounts were positive on the crucial role of SNS in “mobilizing the discontent of citizens under the conditions of a semi-authoritarian political regime” (Lonkila 2012, 9).

This type of research received impetus from a chain of protests, ranging from the Occupy movement in mostly developed countries to the Arab Spring in the Middle East and North Africa region. In these cases, the internet in general and social media in particular were key factors in the scope and magnitude of the protests. The events spurred vigorous research into social media in the context of social capital and civic engagement (Agarwal et al. 2014; Fuchs 2014), the mobilization of protests (Earl et al. 2013; Greene 2013; Breuer et al. 2015), and the methodological boundaries of political science associated with new and growing computational methods (Tremayne 2014; Sinclair 2016; Tucker et al. 2016).

The events of the Arab Spring further reinvigorated the discourse of democratization of authoritarian states, claiming social media to be the key underlying technology promoting political change (Tufekci and Wilson 2012). These studies fell on the fertile soil of the Russian protest realities and determined the key theme of research. In the context of politics, social media and digital social networks, Russian studies therefore explored the main democratization hypothesis (Greene 2013), looking for answers as to why the Russian case did not result in a tumultuous upheaval akin to the Arab Spring (White and McAllister 2014; Reuter and Szakonyi 2013; Pallin 2017). In their answers, authors outlined several key features of social media in Russia. First, they agreed that up until the 2011–2012 protests the Russian digital public sphere had been developing relatively freely, without tight government oversight. Secondly, recognizing the importance of Internet technologies, the authorities preferred a rather flexible model of domination over the rigid regulatory frameworks that are popular in other authoritarian countries (with China being the exemplary case). On the one hand, an active and popular pro-government audience was cultivated and exhibited to the entire political spectrum; on the other, anti-government sentiment was disrupted by various means, including the use of bots and trolls. Finally, the government undertook measures to domesticate the ownership of key social network sites and to ensure the compliance of the large international ones through excessive regulation.

Following the emergence of, and controversy around, the cyber activities of Russian government-affiliated organizations outside Russia’s territorial borders, the research agenda and discourse shifted towards a deeper study of trolls and bots (Jensen 2018; Stukal et al. 2017). Given the public interest in these specific topics, Russian social media studies have thus been dominated by a rather narrow research agenda, mainly referring to abnormal and critical situations. Users’ behavior during such events is, however, likely to differ from their behavior under normal circumstances.

There have been earlier attempts to depict the topology of Russian digital social networks (Barash and Kelly 2012; Kelly et al. 2012) as well as the contents of key discussions (Nagornyy and Koltsova 2017; Maslinsky et al. 2013), using advanced computational methods of Social Network Analysis (SNA) and topic modeling. Yet, apart from specific and quite narrow-focused contributions from other fields and especially computational linguistics, such studies remain on the periphery of contemporary policy research.

2 Goals and Data

This chapter aims to fill this gap by analyzing the intra- and intergroup structure of politically engaged users of Russian social media and presenting the dynamics of ongoing political discussions. It builds on already conducted research and applies key SNA methods and approaches to the corpora of social media data from the Russian segment of Twitter. Therefore, the goals of the chapter are twofold: (1) to demonstrate the potential of SNA to analyze political discussions in Russian social media, and (2) to establish online political communities, determine their internal structure as well as measure their interconnectedness and detect key influencers. The chapter explores a number of hypotheses regarding the role of social media in contemporary Russian politics, which were partially inspired by the previous research in the field.

We propose and test a three-tier analytical strategy, outlining the macro-, meso-, and micro-levels of network analysis. We check whether Russian virtual society is generally divided along the same ideological lines as the public sphere, forming two scattered yet distinct groups: the pro-government sphere on the one hand and, mainly, “non-systemic” opposition forces on the other (Gel’man 2015). Applying automated community-detection methods, we determine and visualize existing online communities. We further analyze their characteristics and, comparing their user structure with the contents of selected discussions, determine their ideological basis. Finally, we investigate relationships between and within communities, establish key leaders and influencers, and test the possibility of a dialogue between existing online political communities (for another case of network analysis, see Chap. 29).

While other social media platforms, such as VK (formerly VKontakte) or OK (formerly Odnoklassniki), are more widely used by the public in Russia, we focus on Twitter for three reasons. Firstly, as a micro-blogging platform, it is inherently public: content can be shared and viewed without a restrictive permission structure, and all public content can be viewed and followed even without a Twitter account. In part due to this, the platform has become a sort of standard mode of operation for public figures, both among the pro-government and the “non-systemic” opposition. In this regard, Twitter has, to a certain extent, complemented LiveJournal, another very popular blogging platform, as a means of reaching a wider audience, yet with shorter announcements and “punchier” messages. Twitter remains the sixth most important social network platform in Russia with 9.9 million unique monthly users.Footnote 1 In absolute numbers, the Russian Twitter segment is the fifth largest in the world in terms of active user accounts.Footnote 2 Moreover, the Russian segment appears to be the most politicized, as a higher proportion of the top 100 most followed accounts are political figures and media accounts than in other countries.Footnote 3 Secondly, Twitter has been heavily used for political information dissemination and event coordination, including for protests, both internationally, such as during the Arab Spring, and in Russia, particularly during the 2011–2012 protests (Lotan et al. 2011; Wolfsfeld et al. 2013; Spaiser et al. 2017). Thirdly, it has been argued that foreign-owned social media (Facebook and Twitter) have a greater impact on the patterns of circulation of anti-government and pro-protest information than domestically owned platforms (VK, OK), due to greater state control over the latter (Reuter and Szakonyi 2015).
Taken together, these factors demonstrate that Twitter in Russia is an important and contested space for the politically engaged segment of the Russian population. Given its inherently open and public nature and high nominal politicization, it is a relevant and valuable platform for analyzing politically active social networks and political communication in the country.

To perform the empirical assessment, we collected six samples of Twitter data on topics of international or national political importance. Given their extensive coverage in traditional media together with higher-than-average Twitter activity, each event demonstrated resonance in Russian political society (see Table 30.1 for the list). Each sample was collected individually using the Search feature of Twitter’s REST API, which allows the retroactive extraction of recent popular tweets containing specific keywords and returns a sample of tweets made in the preceding 7 to 9 days. The advantage of this approach is that it allows the collection of content preceding, during and following each specific informational event. To construct and evaluate user communities for each event, which are commonly understood to be based on whom each user chooses to follow (Colleoni et al. 2014; Barberá et al. 2015; Halberstam and Knight 2015), we collected data on all friends/followers of users who participated in the sampled political discussions. This approach results in the collection of event data, participant users, and the relationships between them. The total corpus of all six samples included 175K users and 978K tweets and retweets.

Table 30.1 Six events and size of captured networks

3 Assessment of SNA Methods

3.1 Macroscopic Methods: Visualizing Russian Online Political Communities

The common starting point of network analysis is the detection of network structure and visualization of the resultant communities. Visualization allows an at-a-glance assessment of the patterns present among the captured entities and helps identify which subsequent methods are relevant for investigating specific details of the relationships of interest. Among various graph-visualization methods, force-directed layouts have become highly popular among practitioners, in part because they are aesthetically pleasing and intuitive (Koren 2003). Force-directed layouts, such as the commonly used ForceAtlas2, simulate a natural physical system of forces acting upon each other, with nodes repulsing each other like charged particles and edges attracting nodes like springs (Bastian et al. 2009; Jacomy et al. 2014). Applied to social networks such as the Twitter follower network, the method visually clusters well-connected users and segregates loosely affiliated groups. It furthermore maps the network, practically visualizing the distance between users and user groups.

Modularity and community-detection methods quantify key network parameters, complementing visualization layouts. The modularity statistic measures how strongly a network is divided into segregated groups; it ranges from −0.5 to 1, with values towards the upper end indicating stronger module segregation (Brandes et al. 2007). Community-detection algorithms analyze the network structure and assign nodes to communities, providing a statistical analysis that complements the visual representation of force-directed graphs. The established network parameters support the evaluation of sociological theories and research hypotheses, such as the presence of “echo chambers” in social networks. Based on the hypothesis that most engagement happens among like-minded and connected individuals (Bakshy et al. 2015; Colleoni et al. 2014), this phenomenon has been widely investigated in international cases (Colleoni et al. 2014; Barberá et al. 2015); it is, however, hitherto under-researched for the Russian case.
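To make the modularity statistic concrete, the following is a minimal sketch in plain Python. It assumes the standard Newman formulation for an undirected, unweighted graph, which is a simplification: the follower networks analyzed in this chapter are directed, and the function name `modularity` and the toy graph are illustrative only, not the authors' implementation.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = 0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        raise ValueError("graph has no edges")

    degree_sum = defaultdict(int)  # total degree per community
    for node, k in degree.items():
        degree_sum[community[node]] += k

    # Q = sum over communities of [ L_c / m - (d_c / 2m)^2 ]
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

In this toy case the two-triangle partition yields a clearly positive value, whereas placing every node in a single community yields zero, which illustrates why higher values are read as stronger segregation.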

Choosing an appropriate community-detection method depends on the network type as well as computational resources, with particularly large-sized networks presenting a challenge. For directed networks, as in this case, with edges representing follower relationships or communication patterns like retweets or mentions, the Infomap method is appropriate (Lancichinetti and Fortunato 2009).Footnote 4 The Infomap method (Rosvall and Bergstrom 2008) simulates a random walk along the edges of the network and categorizes communities where information can flow quickly amongst well-connected users and is unlikely to leave to another group (Rosvall et al. 2009). We apply the Infomap method to categorize user communities on the six captured political samples, calculate modularity, and visualize each using the ForceAtlas2 (Jacomy et al. 2014) force-directed layout (see Fig. 30.1).

Fig. 30.1
An infomap categorizes user communities like pro government, opposition on the six captured political samples in Crimea, Eurovision, Medvedev, Savchenko, Belykh, and Turkey with different modularity, in a gradient of colors.

The structure of political communities on Twitter by event

We observe that the political Russian Twitter space contains a highly stable community structure that parallels the real political landscape in the country, with two major political communities and a multitude of smaller ones, reacting to all political events in the country. The collected data allows us to assess various characteristics of established communities. Particularly, the basic follower method is complemented with an evaluation of network structures based on typical Twitter activities such as retweets and mentions. Both are useful as a retweet can be a symbolic representation of the consonance of opinions or importance of specific information, whereas mentions provide a wider spectrum of reactions and relationships between users. As hypothesized, there is division into two major competing political forces (Gel’man 2015), with the two major communities being (1) the pro-Kremlin (pro-government) supporters and Russian nationalists (community 0 or purple), and (2) the liberal and non-systemic opposition (community 2 or teal).

We also found that the “echo chamber” theory applies well to the Russian Twitter network, as the follower-based communities were highly polarized. Modularity varied between the events, from a relatively low statistic of 0.2101 on the Crimea sample to a moderately strong statistic of 0.4732 for the Eurovision sample (see Fig. 30.1).Footnote 5 Furthermore, users in each community showed a strong preference to retweet, mention, and communicate with users in their own community and a low preference to do the same with users in other communities. As Tables 30.8, 30.9, and 30.10 (Annex A) show, on average 75% of all mentions, retweets, and replies happened within communities, and only 25% happened between users of different groups. Looking specifically at the pro-government and opposition communities, we see that they are very highly polarized, as they retweet on average only 5% of the content created in the rival group. Interestingly, the opposition group is the less polarized of the two, possibly because it is on average three and a half times smaller than the pro-government group.
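The within-versus-between comparison can be illustrated with a short sketch. It assumes that each interaction (a retweet, mention, or reply) is recorded as a pair of user identifiers and that each user has already been assigned a community label; the function name and the toy data are hypothetical.

```python
def within_community_share(interactions, community):
    """Fraction of interactions whose two users share a community.

    interactions: iterable of (source_user, target_user) pairs.
    community: dict mapping user -> community label; interactions that
    involve an unlabelled user are skipped.
    """
    total = 0
    within = 0
    for source, target in interactions:
        if source not in community or target not in community:
            continue
        total += 1
        if community[source] == community[target]:
            within += 1
    if total == 0:
        raise ValueError("no interactions between labelled users")
    return within / total


# Toy data (hypothetical): three interactions stay inside a community,
# one crosses the community boundary.
labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
retweets = [("a", "b"), ("a", "c"), ("d", "e"), ("b", "d")]
share_within = within_community_share(retweets, labels)
share_between = 1 - share_within
```

Computed separately for retweets, mentions, and replies, a share of this kind gives a simple per-activity measure of how inward-looking each community is.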

3.2 Mesoscopic Methods and Russian Political Communities: Similar or Different?

Upon detecting and visualizing a macro structure of the whole network, it is useful to detail each of the detected communities through the two methods of density and transitivity. Both demonstrate the compactness of each community network, showing whether a group’s users are only loosely connected or highly interrelated and hence likely to be ideologically contiguous. Whereas density approaches the network holistically, measuring the proportion of connections that are present in the network against the total number of possible connections, transitivity measures the proportion of triangles (or three users connected to each other) against all possible triangles, a stronger indicator of interrelationships. Therefore, networks that have high density but low transitivity will be relatively interlinked, but not all users will know each other.Footnote 6 In practical terms, naturally built tight communities signify the presence of numerous multi-user interactions and the sharing of social trust and social capital within the group (Coleman 1990).
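Both measures are straightforward to compute. The sketch below uses plain Python and treats the community network as undirected for simplicity, which is an assumption rather than a description of the chapter's procedure; transitivity follows the common definition of three times the number of triangles divided by the number of connected triples. All names are illustrative.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Share of possible undirected edges that are actually present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(neighbours) for neighbours in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def transitivity(adj):
    """3 * triangles / connected triples (global clustering coefficient)."""
    closed = 0   # closed triples; every triangle is counted three times
    triples = 0  # connected triples, counted at their middle node
    for neighbours in adj.values():
        k = len(neighbours)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(neighbours, 2)
                      if b in adj[a])
    return closed / triples if triples else 0.0


# Toy community (hypothetical): a triangle with one pendant user.
toy = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
toy_density = density(toy)
toy_transitivity = transitivity(toy)
```

Note that users without any edge never enter the adjacency structure here, so a real analysis would need to add isolated community members explicitly before computing density.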

A further method is to detect cliques in a network, that is, subsets of nodes in which every node is connected to every other node. In social networks, cliques are sometimes referred to as clique communities, where groups of users are completely interconnected, with larger communities often containing many cliques. A benefit of clique analysis is that the prevalence and average size of cliques in a community network provide insight into the structure of the political group. A community with a large number of tri-node cliques (triads) is relatively dispersed, whereas the presence of several cliques with a large number of nodes each hints at a relevant sub-community structure for further analysis. Furthermore, as information is disseminated on Twitter through follower relationships, cliques offer a way of evaluating information propagation through a community, as well as of analyzing in detail the behavior of users in one clique versus another, as individuals tend to be highly influenced by the clique they belong to (Borgatti et al. 2009).
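As an illustration of clique analysis, the sketch below enumerates maximal cliques with the Bron-Kerbosch algorithm (with pivoting) and tabulates the share of cliques by size. The choice of algorithm, the undirected treatment of ties, and the `min_size` parameter are assumptions made for this example; the chapter does not state which procedure was used.

```python
from collections import Counter


def maximal_cliques(adj):
    """Yield maximal cliques as frozensets (Bron-Kerbosch with pivoting).

    adj: dict mapping node -> set of neighbours (undirected, no self-loops).
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the node with most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda node: len(adj[node] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def clique_size_distribution(adj, min_size=3):
    """Share of maximal cliques of each size, ignoring those below min_size."""
    sizes = Counter(len(c) for c in maximal_cliques(adj) if len(c) >= min_size)
    total = sum(sizes.values())
    return {size: count / total for size, count in sizes.items()} if total else {}


# Toy community (hypothetical): a triangle a-b-c plus a pendant user d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
toy_cliques = set(maximal_cliques(toy_adj))
toy_distribution = clique_size_distribution(toy_adj, min_size=2)
```

The recursive formulation is adequate for small community subgraphs; very large networks would call for an iterative or library implementation.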

We find that the identified main political communities in Russian Twitter have vastly different characteristics and vary by event. The opposition community is a relatively dense and closely knit group, generally having stronger ties between individuals and likely sharing more meaningful interpersonal relationships. The pro-government community, on the other hand, is a more loosely related group of independent mini-communities, with more sporadic links between the sub-groups. In all six samples, the density of the opposition community exceeded that of the pro-government group (Table 30.2). Looking at transitivity, the pattern is repeated, although not as strongly and not for every sample. Clique distribution further underlines the social structure of both, as cliques in the opposition tend to be much smaller (Figs. 30.2 and 30.3). The looser amalgam of large cliques in the pro-government group also underlines the importance of public opinion leaders in reaching each of these larger mini-communities.

Table 30.2 Density and transitivity of the network in its entirety and within its main communities
Fig. 30.2
A bar graph compares the clique size frequency distribution for pro government and opposition in Crimea. Opposition communities 9 and 10 have a high distribution of approximately 20.00%. Pro government community 16 has the highest distribution of approximately 17%.

Clique size frequency distribution by community—Crimea sample

Fig. 30.3
A bar graph compares the clique size frequency distribution for pro government and opposition in Medvedev. Opposition community 13 has the highest distribution of approximately 17%. Pro government community 13 has the highest distribution of approximately 5%.

Clique size frequency distribution by community—Medvedev sample

An important remark about meso-level methods, applicable to both Russian and international contexts, has to be made, however. Both density and transitivity can be sensitive to the quality of the sample data. For instance, keyword or hashtag searches could miss statements, and hence users, that reference the political event only indirectly. This would inevitably affect the subnetwork structure. Furthermore, the rate limits and indexing algorithms used by social media platform APIs could also seriously affect meso-level methodologies (Pfeffer et al. 2018). One way to alleviate sampling issues is to use multi-sample approaches to demonstrate cross-event consistency, as done in this study. Another is to sample using only general limitations, such as language or location. While Twitter’s free API does not offer location filtering, language filtering has potential for Russian political analysis, as fewer users outside the country (compared to international languages) would engage in online political discussion.Footnote 7

3.3 Microscopic Methods: Opinion Leaders in Russian Online Political Networks

Following the evaluation of network structure and community sub-structure, scholars often turn to identifying the network’s “influencers” and measuring their impact, as well as comparing these influencers to offline opinion “leaders.” Traditional elites, who have always had the ability to shape the political narrative, have seen their power greatly expanded with Twitter and other social media spaces (Jungherr 2014). Previous research, both internationally (Bakshy et al. 2011) and on Russia (Roesen and Zvereva 2014), has found that traditional “leaders” can be cumulatively overshadowed on social media by “ordinary influencers” (Bakshy et al. 2011, 8), or median public figures with an average “offline” influence.

3.3.1 Identifying and Evaluating “Influencers”

The analysis is based on a sample of 469 accounts, which comprises public personalities and organizations as well as traditional media. These users were selected if they: (1) actively post on politically relevant events; (2) have at least ten thousand followers; and (3) either occupy positions in a government/non-government organization, or are well-known media personalities. The sampling technique adapted the “snowballing” approach but required several stages in order to improve the validity of the outcome. First, a top tier of politically relevant users was manually selected from the list of the top 100 most popular accounts in the Russian segment. Secondly, from all samples collected, the 1000 most followed accounts were selected and manually sorted in order to identify politically relevant ones. These two steps together resulted in a list of 240 accounts. Among these accounts, only those that followed no more than 500 others were selected. Subsequently, the list of friends of each was obtained, but only those friends who themselves had at least 10,000 followers were retained. Qualitative filtering of this list resulted in the master sample of 469 active Twitter public personalities.

The selection process inherent in this type of sampling technique can be considered to establish a representative collection of user accounts. Given the “echo chamber” effect, it can be assumed that those who use Twitter as an interactive platform, and not only spread but also receive information, will strategically connect with (or themselves follow) a limited number of personalities, many themselves public figures and involved in analogous activities (politics in the case of this study). Although the 10,000-follower threshold is rather arbitrary, it allows the selection of only those accounts that have the potential to efficiently create and/or disseminate political information. Similarly, the 500-friend threshold excludes those personalities who apply a tactic of following any account that interacts with them, and whose inclusion would not improve the sample.Footnote 8 To support the assessment of the endurance and impact of content created by opinion leaders, the last 3200 tweets of each were downloaded using the REST API and ranked in terms of their impact on political discussions.

Four types of politically relevant leader accounts can be identified in the Russian Twitter segment. The first type comprises personal accounts of top politicians, media, and public personalities. Many of these accounts can be regarded as official, as they are verified “de-jure,” while others produce content that corresponds with the ideological views of their nominal owners and therefore can be regarded as “de-facto” genuine. The second type comprises accounts of traditional media sources, which use the platform predominantly to reach a wider user audience. In most instances, tweets produced by these accounts contain links to materials published on the media’s websites, sometimes with opinionated comments that reflect the editors’ ideological preferences. These accounts appear to be the most interconnected within as well as outside the ideologically bounded communities they belong to.

The third type includes official accounts of government agencies, which were selected for analysis on the basis of multiple premises. Twitter has been actively used by private sector companies and entrepreneurs for marketing purposes. Indeed, there is a growing body of research on the subject, which explores and analyzes strategies of efficient public relations and marketing for businesses. If used efficiently, Twitter could boost a company’s performance. The same logic is applied to political organizations (Waters and Williams 2011; Towner and Dulio 2012), which adopt advanced technologies of governance within the Government 2.0 paradigm. This approach was officially adopted in Russia in the context of the Federal Program “Information Society 2011–2020” (Zherebtsov 2019).

Accounts that produce and circulate political satire and politically relevant entertainment content comprise the fourth type, which we conventionally refer to as the parody group. While they themselves are not sources of official information or representatives of certain political groups, such accounts appear at the epicenter of selected discussions and disseminate certain sentiments. Moreover, they are quite popular not only among regular users but also among top political influencers.

The analysis of content produced by the leaders reveals several remarkable trends. There is a certain consistency between the groups in terms of retweeting and liking messages. The parody group outperforms all others in the combined popularity of its messages. Needless to say, all accounts in our sample that belong to this group produce and share oppositional sentiment. Personal accounts of political leaders comprise the second most popular group on Twitter. Interestingly, the content produced by these types of accounts is as often retweeted (or shared and thus actively endorsed) as it is liked (or passively endorsed). The Twitter activity of traditional media appears to be much lower than the first two types of accounts. To some extent, this demonstrates a quite remarkable pattern of social engagement in the Russian Twitter segment. While entertainment purposes are prevalent even in the context of political discourse (as demonstrated by the overwhelming popularity of parody accounts), users tend to get involved in political discussions and favor opinionated statements of political pundits and media personalities over factual information circulation. This being said, it was to be expected that official accounts appeared to be the least publicized in our sample; a trend best explained by the nature of content produced and shared by the accounts of this group. As official accounts tended to share links to digests and press releases, produced by the press-services of their respective agencies, this information is regarded as the least entertaining (or “infotaining”) to users.

Table 30.3 illustrates the ranking of leaders’ accounts by popularity in terms of both active (retweets) and passive (likes) endorsements, both on average for all accounts over the entire sample collected and for a subsample of the top 10,000 most popular tweets authored by the leaders. With the former, the picture is quite consistent. The latter, however, shows that for retweets the group of official accounts ranks higher than it does for likes (second rather than fourth), while parody accounts rank lower (fourth). A cursory evaluation of this shift, based on content analysis, revealed unusually high activity of automated Twitter accounts (i.e., bots), which suggests a selective strategy of boosting certain topics.

Table 30.3 Leaders’ impact metrics using all data collected or focused on the top 10k popular tweets

The types of leaders also differ from one another in terms of their capacity to influence the content and sentiment of online conversations. To perform this task, the most critical metrics of individual tweets—likes and retweets—were queried from a sample of 3200 most recent tweets, authored by the leaders. These metrics were aggregated and the average number of “likes” and retweets per leader was calculated. Used independently, it provides a good estimate of the “average power of a tweet” of the given user, although it does not consider the issue of outliers—accounts with relatively short lifespan and yet, quite high performance metrics. To address this, the maturity of accounts was estimated by multiplying the “tweet power” metric by the average number of tweets per day (1). Given the fact that all accounts in the leaders sample are real and used actively, the issue of automated content generation did not affect the overall calculations. Assuming that bots are less likely to be followed by leaders, the sample showed no evidence of the presence of unusually and/or suspiciously active accounts. Therefore, the average leader account generates approximately 16.03 (+/−2.3) tweets per day and the most active account, quite expectedly belonging to the media group, generates on average 170 tweets per day.

$$ \text{Average tweet power of Leader}_i = \left( \overline{Favorites}_i + \overline{Retweets}_i \right) \ast \overline{NumTweetsPerDay}_i \qquad (1) $$
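As a worked illustration of (1), the sketch below computes the metric for a single leader from per-tweet like and retweet counts. How the average number of tweets per day is derived is not specified in the text; here it is assumed to be the number of sampled tweets divided by the number of days they span, and the function and parameter names are hypothetical.

```python
def average_tweet_power(favorites, retweets, days_spanned):
    """Equation (1): (mean favorites + mean retweets) * mean tweets per day.

    favorites, retweets: per-tweet counts for one leader, in the same order.
    days_spanned: number of days covered by the sampled tweets (assumption:
    tweets per day = number of sampled tweets / days_spanned).
    """
    n = len(favorites)
    if n == 0 or n != len(retweets):
        raise ValueError("need equally long, non-empty count lists")
    if days_spanned <= 0:
        raise ValueError("days_spanned must be positive")
    mean_favorites = sum(favorites) / n
    mean_retweets = sum(retweets) / n
    tweets_per_day = n / days_spanned
    return (mean_favorites + mean_retweets) * tweets_per_day


# Toy leader (hypothetical numbers): three tweets over a day and a half.
# Mean favorites = 20, mean retweets = 10, tweets per day = 2.
power = average_tweet_power([10, 20, 30], [5, 5, 20], days_spanned=1.5)
```

For the toy leader the metric equals (20 + 10) * 2 = 60, which shows how a prolific account with modest per-tweet reactions can match a less active account with stronger reactions.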

The list of leaders’ impact scores obtained from (1) was sorted from most to least impactful, and the overall list of 469 accounts was broken down into quantiles. Figure 30.4 represents the breakdown of account types per quantile. Obviously, the parody group accounts generate content ordinary citizens are eager to react to: 53.1% of such accounts in the sample appeared in the first quantile. Approximately a quarter of personal accounts demonstrate the tendency to generate highly resonant content (23.5% in the first quantile). Interestingly, this most populous group is almost evenly distributed. Official accounts follow a somewhat normal distribution, peaking in the third quantile, hence generating relatively impactful content. The discrepancy between this distribution and the high performance of these account types in the top 10,000 sub-sample (Table 30.3) raises the importance of future in-depth content analysis of messages produced by this group. On the one hand, this content could be artificially “boosted”; on the other, top “tweets” could actually discuss politically crucial issues and be genuinely shared across the network, which to some extent supports the thesis of the bursty nature of Twitter networks (Myers and Leskovec 2014). Finally, and rather surprisingly, the most prolific group of media accounts tends to be distributed towards the lower part of the scale.

Fig. 30.4
A grouped bar chart represents the breakdown of account types like Parody, Personal, Media and Official per quantile. 53.1% of accounts in the sample are parody in the first quantile. 24.5% of accounts in the sample are Personal in the second quantile. 32.7% of accounts in the sample are Official in the third quantile. 25.6% of accounts in the sample are Media in the fourth quantile. 53.5% of accounts in the sample are Media in the first quantile.

Influence of leaders’ content, distribution across quantiles

3.3.2 Developing an Index of “Influence”

The average tweet power metric for Twitter data, while indicative of certain patterns in ongoing and historical discussions, does not take into consideration network “influence” and the leaders’ ability to disseminate content throughout the network and in their particular communities. While the topic has gained significant attention in the research community, no established and widely accepted method of identifying Twitter influencers exists. Time-invariant approaches tend to compute influence on the basis of either centrality (network-dominated approach), or content impact (retweet-dominated approach). At the same time a combination of both methods could be quite productive. We propose a method utilizing network centrality and demonstrated ability to disseminate content.

Influence can be defined as the ability to seed discussions and spread content throughout the network. It can be seen as a derivative of two major parameters: the importance of content and its ability to meet the aspirations of ordinary users, and the capacity of this content to spread through the network and be visible to a wide audience. The former is marked by users’ reaction to content, similar to the approaches taken in evaluating “influencers” above. The latter, on the other hand, evaluates the placement of the leader within a network or community, as a central placement creates a better opportunity to disseminate content amongst a wider audience. As such, we determine PageRank centrality on the “follow” relationship of Twitter, which is seen both as an indicator of information-gathering and as a social connection between two users, especially if it is reciprocated (Myers et al. 2014; Frederick et al. 2012).

Centrality is the most commonly used approach to determine the importance of nodes in a network (Livne et al. 2011; Romero et al. 2011). PageRank Centrality (Page et al. 1999), most famously used in Google search, assigns a probability distribution to the network, representing the chance of randomly picking a specific node. When applied to social networks, it allows the ranking of users by importance relative to each other. Centrality was combined with the aggregated average number of “likes” and “retweets” obtained from the 3200 tweets authored by each individual “leader.” Combining both parameters yields an index of identified leaders’ influence, which represents the potential to have an impact, rather than a bona fide substantiation of influence. Adopting the average tweet power metric (1), we multiply it by the PageRank centrality of each candidate to generate the “influence” index (2).

$$ (2)\; \text{Leaders' influence}_i = \text{Average tweet power of Leader}_i \ast Centrality_i $$
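A minimal sketch of how index (2) could be computed is given below. It uses a plain power-iteration PageRank on the “follow” graph; the damping factor of 0.85, the fixed number of iterations, and all function and variable names are assumptions for illustration, as the text does not specify the implementation.

```python
def pagerank(follows, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed follow graph.

    follows maps each user to the set of users they follow; an edge
    follower -> followed passes rank to the followed account.
    The damping value is an assumed, conventional default.
    """
    nodes = set(follows) | {v for targets in follows.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread evenly over all nodes.
        dangling = sum(rank[u] for u in nodes if not follows.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u, targets in follows.items():
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


def influence_index(tweet_power, centrality):
    """Equation (2): average tweet power multiplied by PageRank centrality."""
    return {u: tweet_power[u] * centrality[u] for u in tweet_power}
```

Because PageRank values sum to one across the network, the index is only meaningful for ranking leaders relative to each other within the same graph.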

Introducing centrality and combining the data with leaders’ assigned InfoMap communities alters the observed distribution considerably. Firstly, the first two quantiles of the most influential Twitter users are composed predominantly of opposition accounts (Fig. 30.7, Annex B). In the first quantile, two-thirds (or 60%) of accounts belong to the opposition and only one-third to the pro-government community. A similar situation is observed in the second quantile, where 71% of accounts can be referred to as belonging to the opposition, and only 29% to a pro-government group. The first quantile included such popular opposition leaders as Alexei Navalny, Leonid Volkov, Oleg Kashin, media outlets TV Rain (Dožd’), Èho Moskvy, and Meduza, as well as highly influential parody accounts. The pro-government group, although outnumbered by its opponents, is represented by its most outspoken pundits (Vladimir Solovyev and Alexei Pushkov) and notable media sources (RIA Novosti, Vesti News). Interestingly, the most followed political accounts of Vladimir Putin and Dmitry Medvedev, although appearing in the top quantile, are located in the middle and at the end of it, respectively. Secondly, the distribution of account types across quantiles is flatter, with a decline in the proportion of parody and personal accounts in the first quantile and an increase in media and official accounts, and a gain in parody and media accounts at the expense of personal and official accounts in the second quantile (Fig. 30.8, Annex B). This can be understood as an indication that media and official accounts, while not impactful in terms of content, are central to the network and hence have a higher ability to distribute their content.
The shift of a certain proportion of parody accounts from the first to the second quantile, as well as the relative decline of personal accounts, is a further validation of the presence of the “echo chamber.” As parody and personal content is usually popular in specific audiences, these accounts are not highly followed by opposing communities and hence do not share central position in the whole network.

Furthermore, such dominance of opposition accounts in the top half of the influence index speaks of the higher importance of this form of communication for the opposition and also supports evidence of the greater structuration and network sophistication from the network analysis. The opposition not only focuses on social media as its main form of reaching the audience but also emphasizes the role of opinion leaders. In this regard, Alexei Navalny is the major actor and the greatest influencer not only within his own political community, but also in the entire network. Pro-government pundits, like Vladimir Solovyev and Alexei Pushkov, outperform their own formal leaders in terms of influence in the virtual community, and accounts of traditional federal mass media are instrumental in the dissemination of pro-government content. This establishes a new framework for evaluating Russian political Twitter, which is quite different from Kelly et al. (2012) in terms of network structure and from Greene (2018) in terms of content.

3.3.3 Cross-Validation of the Proposed “Influence” Index

The proposed index (2) requires further testing and validation. Given the nature of the research topic, where outcomes are easily predictable on the basis of traditional theories and concepts of Russian politics, the best way to test the reliability of a new instrument would be the utilization of another approach. Given that this new method is a derivative of other major influence indicators, reusing them would result in unfavorable procedural overlap and, thus, similarity of outcomes. To overcome this issue, and to avoid complex dynamic methods, this research adapts the principle behind the Hirsch index (h-index) of academic impact (3).

The h-index is rather unexpectedly suitable and productive for measuring leaders’ performance on Twitter, and even overcomes deficiencies visible in the context of scholarly work. Firstly, leaders on Twitter are akin to scholars in academia, producing content aimed at specific audiences and seeking endorsement for their work in terms of citations or “likes” and “retweets.” Secondly, both academic papers and blog messages increase their value through references, with the growth being well documented and easily accessible. Thirdly, academics and leading bloggers both tend to increase their visibility by producing the maximum possible high-quality content. Moreover, the ample quantity of blog posts overcomes the limitations of academic work, where the number of contributions is usually lower.

$$ (3)\; \text{Leaders' influence with Hirsch}_i = \text{hirsch method}\left( Favorites + Retweets \right) \ast Centrality_i $$

Therefore, the use of the h-index seems justified, as it addresses the issue of outliers (i.e. highly popular tweets) as well as the lifespan of accounts (immature, yet highly popular accounts) and provides a weighted rank of significant contributions. To put it simply, the h-index algorithm finds an “ideal point” between the number of contributions and their relative popularity, which for Twitter can be considered as the sum of “likes” and “retweets” for each user’s post (hirsch method(Favorites + Retweets)). All leaders were ranked according to the obtained indices and the resultant list was compared with the ranked list of leaders obtained through the index method proposed by this research. Spearman’s rank correlation coefficient (ρ) was utilized to establish whether both methods were concordant. It demonstrated a high correlation coefficient of 0.69 between the proposed influence index (2) and the modified h-index (3). Notably, this coefficient was calculated when the h-index did not refer to the centrality parameter of each leader account. Including the centrality indicator increased the correlation coefficient to 0.80.
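The two steps described above, the h-index over per-tweet engagement and the rank comparison, can be sketched as follows. The function names are hypothetical and the Spearman coefficient is computed directly from average ranks; this is an illustrative sketch, not the code used to obtain the reported coefficients.

```python
def h_index(engagements):
    """Largest h such that h tweets each have at least h likes + retweets."""
    h = 0
    for position, count in enumerate(sorted(engagements, reverse=True), start=1):
        if count < position:
            break
        h = position
    return h


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Here `engagements` would hold favorites plus retweets for each of a leader’s tweets; multiplying the resulting h-index by the leader’s centrality gives the quantity in (3), and `spearman_rho` would then be applied to the two index vectors taken over the same ordered list of leaders.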

As a ranking algorithm, the h-index provides a useful method for establishing the most influential contributors and can be used for ranking leaders. It also confirms the validity of the proposed time-invariant influence rank (2). Like any other method, the h-index for Twitter is not without deficiencies and may not be suitable for samples where leaders are highly popular and produce a large quantity of tweets. As the Twitter REST API limits access to the 3200 most recent posts, the h-index cannot exceed the number of posts retrieved. Yet in the case of current measurements of Russian Twitter, this issue was not a problem, as the most popular user, Alexei Navalny, scored only 902 points on the scale. Furthermore, both indices (2 and 3) are consistent and consonant with common wisdom that the actual disposition of actors and organizations within the political arena should be correlated with their political influence.

4 Beyond the Score: Cross Validation of Detected Patterns

4.1 Further Validating “Echo Chambers”

Focusing on intra- and cross-community conversation, we observed homophilous conversation patterns between the various communities as users tended to share content with like-minded individuals inside their own community, most especially between the pro-government and Opposition groups. Nominal homophily, however, can be misleading, as users in small communities are more likely to converse across community lines simply because their community is small, and users in large communities are unlikely to converse outside their group due to its relative proportion. We adopt a method developed by Currarini et al. (2007) to validate the nominal homophily observed. Specifically, nominal homophily, or the proportion of conversations a community has within itself (\( H_i \)), is compared to its relative size within the network (\( w_i \)).

$$ (4)\;{H}_i={w}_i;(5)\;{H}_i>{w}_i;(6)\;{H}_i<{w}_i $$

Baseline homophily (4) occurs when the proportion of user conversations within a community equals the relative size of the community, indicating that on aggregate, users in that community show no special preference or bias for their own friends. Inbreeding homophily (5) indicates that users are biased and converse more often within their own group than is expected on the basis of its relative demographic size. Finally, if a community shows heterophilous patterns (6), the number of conversations within the group will be less than the relative size of the group.Footnote 9 To enable comparisons between communities of various sizes and different conversation types on Twitter, we standardize homophily indicators for (7) baseline homophily, (8) inbreeding homophily, and (9) heterophily.

$$ (7)\;\frac{H_i}{w_i}=1;(8)\;\frac{H_i}{w_i}>1;(9)\;\frac{H_i}{w_i}<1 $$
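The standardized indicator in (7) to (9) can be sketched as below. The data layout (conversation pairs and a user-to-community mapping) and the function name are assumptions for illustration; in particular, \( H_i \) is taken here as the share of conversations started by members of community \( i \) that stay inside that community.

```python
def standardized_homophily(conversations, membership):
    """Return {community: H_i / w_i}.

    conversations: iterable of (source_user, target_user) pairs.
    membership: {user: community}.
    A ratio of 1 is baseline homophily, above 1 inbreeding homophily,
    below 1 heterophily.
    """
    sizes = {}
    for community in membership.values():
        sizes[community] = sizes.get(community, 0) + 1
    total_users = len(membership)

    within, total = {}, {}
    for source, target in conversations:
        c = membership[source]
        total[c] = total.get(c, 0) + 1
        if membership[target] == c:
            within[c] = within.get(c, 0) + 1

    result = {}
    for c, n_conv in total.items():
        h = within.get(c, 0) / n_conv   # H_i: share of conversations kept inside
        w = sizes[c] / total_users      # w_i: relative size of the community
        result[c] = h / w
    return result
```

Dividing by \( w_i \) is what makes small and large communities comparable: a small community that keeps most of its conversations internal yields a ratio far above 1, while a community covering most of the network can never score much above 1.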

Investigating standardized homophily indicators (\( \frac{H_i}{w_i} \)), we find that each community demonstrated strong inbreeding homophily (Table 30.4). Interestingly, the non-systemic opposition is more homophilous than the pro-government community. A few communities, such as communities 3 and 4, while quite small, demonstrated excessively high standardized homophily indicators.

Table 30.4 Relative community sizes and standardized homophily indicators

4.2 Makeup of Two Main Political Communities and Their Reactions to Political Events

Individually assessing the pro-government community, we observe that it was by far the largest political group, always actively participating in all events. The community displayed strong pro-Putin, pro-government, anti-western (including anti-US [United States], anti-Europe and anti-Ukraine), and, in some instances even nationalist sentiment. While sometimes a little critical of the regime, the users in this community (community 0 or purple in Fig. 30.1 above) generally disseminated information in line with a patriotic narrative and demonstrated two patterns of Twitter use. If the informational event was not negative to Russia or the government, such as the two-year anniversary of the accession of Crimea as part of the Russian Federation, then reactions were usually event-specific and generally positive. However, if the informational event was inherently negative to the government, reaction was usually split between a certain proportion of anti-government content, and neutral or positive pro-government reaction. In certain cases, a pattern is evident whereby positive content was coordinated around specific keywords that were trending negatively in order to coopt the term and spin it positively, distracting the conversation to unrelated pro-government content.

Reactions to Prime Minister Medvedev’s comment of “there is no money, but you hang in there” to Crimean pensioners, posted to YouTube on May 23, 2016, demonstrate these two patterns well (Fig. 30.5). While some users derided the Prime Minister’s comments, factual and neutral reactions were quite prevalent. A large amount of disseminated content focused on unrelated positive topics to distract and mitigate the initial negative reaction inside the community. Two stories were widely mentioned on May 23 and 24 focusing on specific keywords. The first focused on the word “money” by distributing a story on the Prime Minister promising to find money for museums in Crimea. The second focused on the terms “economy” and “investment,” disseminating content about the release of a government plan, approved by the Prime Minister, aiming to increase domestic demand for the products of Russian chemical and petrochemical industries. Other reactions also included the factual reporting of the Prime Minister’s comments or presenting the information in a neutral fashion, with tweets such as “Medvedev admitted that there is no money to index pensions.”Footnote 10 Interestingly, such neutral posts usually did not include links and were composed of just text.

Fig. 30.5
A graph of the proportion of conversations on a topic and the number of tweets per hour versus time. It plots 2 fluctuating curves of the proportion of tweets on topic per hour and the number of tweets per hour.

Pro-government community reaction to Medvedev’s comment to pensioners in Crimea

The solid line indicates the number of tweets on an hourly basis (right axis) in the pro-government community, and the dashed line indicates the proportion of conversation (i.e. of the tweets and retweets made during that hour) that had to do with Medvedev’s comment (left axis). Tweets containing “money,” “hang in there,” “pensioners,” “have a good day” (“deneg,” “den’gi,” “deržitesʹ,” “pensii,” “pensij,” “nastroenie”) were used to calculate the proportion. Specifically, lemmatized words in each tweet were checked against the lemmas of desired keywords. Tweets that contained keywords known to be used in the counter strategy were excluded.
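The hourly series described in this note can be sketched as follows. The sketch assumes that tweets arrive already lemmatized (the lemmatizer itself is not shown), and it treats tweets containing counter-strategy keywords as not on topic rather than dropping them from the hourly total; both choices, along with the function name, are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def on_topic_share_per_hour(tweets, keywords, excluded=()):
    """Return {hour: (tweet_count, share_on_topic)}.

    tweets: iterable of (timestamp, tokens), tokens already lemmatized upstream.
    keywords: lemmas that mark a tweet as being about the event.
    excluded: lemmas tied to the counter strategy; a tweet containing any
              of them is not counted as on topic (assumed interpretation).
    """
    keywords, excluded = set(keywords), set(excluded)
    counts = defaultdict(lambda: [0, 0])  # hour -> [all tweets, on-topic tweets]
    for timestamp, tokens in tweets:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        tokens = set(tokens)
        counts[hour][0] += 1
        if tokens & keywords and not tokens & excluded:
            counts[hour][1] += 1
    return {hour: (n, hits / n) for hour, (n, hits) in sorted(counts.items())}
```

The first element of each pair corresponds to the solid line (tweets per hour) and the second to the dashed line (proportion of conversation on the topic).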

The second of the two main political groups, the Opposition community (community 2, marked teal in Fig. 30.1 above), was on average 3.5 times smaller. Its users displayed negative, ironic, and critical assessments of the Russian government and also disseminated Ukrainian-friendly, pro-US, and pro-Western content. While users in this community also shared content on liberal values, such as opposition to authoritarian government or support for democracy, a majority focused on vilifying the government, with users spreading negative memes or ridiculing government strategies or statements. Indeed, particularly virulent ridicule and even contempt of the government followed Medvedev’s comments in Crimea. The community is also made up of a sizable proportion of Russian-speaking Ukrainians, which seemed to influence how the group reacts to informational events. Indeed, the Savchenko affair and the Eurovision contest, both highly interrelated with Ukrainian politics, are the largest samples of captured Opposition group users, with the former being the largest by number of tweets and the latter the largest in quantity of engaged users.

Similar to the pro-government community, content shared within the Opposition group followed a dual pattern. If the informational event was neutral or negative to the government, and hence in line with community expectation, users either discussed the topic in a neutral fashion or spread content negative to the government. However, when the informational event ran counter to community expectations, then reaction was split as users reacted in different ways. The three-pronged reaction following the arrest of the liberal governor of the Kirov Oblast, Nikita Belykh, well demonstrates this trend. Factual and neutral tweeting was predominant; however, genuine shock and disbelief, often including statements that Belykh was set up, was also prevalent (see Fig. 30.6). Finally, a third opinion expressed by the community was that of anti-Belykh statements, believing that he betrayed the liberal movement by becoming a systemic politician.

Fig. 30.6
A graph of the proportion of conversations on a topic and the number of tweets per hour versus time. It plots two increasing and decreasing trend curves of the proportion of tweets on topic per hour and the number of tweets per hour.

Opposition community reaction of disbelief to Belykh’s guilt

Represents data from June 25 to June 28. The solid line indicates the number of tweets per hour (right axis) and the dashed line indicates the proportion of conversation believing Belykh was set up (left axis). Tweets containing lemmatized words including “setup,” “don’t believe” and “provocation” (“podstava,” “podstavili,” “podstavit’,” “podstav,” “ne verû,” “poverit,” “provokaciâ,” “provocirovali”) were used to calculate the proportion.

Comparing the standardized rate of tweets per hour between the two main communities, we see that the pro-government group reacted very differently than the Opposition group to several events, most notably during the Medvedev event (see Medvedev Chart in Annex C).Footnote 11 The initial spike of tweeting activity in mid-day on May 24 lagged the initial and larger reaction by the Opposition group by a few hours, indicating that the pro-government group was less likely to immediately react to the negative information. The secondary and larger spike of activity in the latter half of the day is particularly interesting, given its size and the fact that the content shared during it was mostly not about Medvedev’s comment at all. Comparing the (nominal) number of tweets every two hours with the proportion of the tweets that have to do with Medvedev’s comments, we see that the conversation shifted to discussing other topics during this second spike (see Fig. 30.5). Following this, unrelated content on Medvedev continued to dominate discussions within the community, with the proportion mentioning the pensioner comment mostly remaining in the 20–30% range. Moreover, given that the keyword used to collect this sample was “Medvedev,” we hypothesize that this pattern of distraction specifically had to do with positively portraying the Prime Minister.

4.3 Finding Bots Within Russian Twitter

SNA is undeservedly neglected in the context of the mainstream topic of “bot” and “troll” impact on Russian “online” political discussions. Recent research suggests that content created on Russian Twitter has a 50% probability of being produced by “bots” (Stukal et al. 2017). As stated elsewhere (Murthy et al. 2016), bots can have an impact on simple indicators such as follower counts or hashtag boosting. This may affect users who follow others indiscriminately, as well as network metrics as a whole, which assess all tweets in the network without taking into account the structure of social networks (Ferrara 2018). As ordinary users tend not to follow “bots” and segregate themselves into isolated “echo-chambers,” bots are likely to be segregated to isolated communities that have little influence on real politically engaged users.

To evaluate the potential impact and ensure the validity of our findings, we apply a three-method strategy to measure the prevalence of bots within the identified network structure. First, we evaluate the proportion of duplicate and highly similar content created in each community, as bots are known to repeat (not retweet) identical tweets (Lawrence 2015). As bot designers can apply very minor purposeful variation to tweets, such as adding an extra hashtag or changing the beginning of the tweet, we compare the similarity of tweets by excluding tweet extremities. Secondly, to validate the results of the first method, we qualitatively assess a sub-sample of tweets shared in suspect communities. Finally, we apply the popular BotOrNot method, also known as Botometer, to score the likelihood of each user in our samples being a “bot” (Davis et al. 2016). This feature-based method evaluates a set of behaviors of a Twitter account and assigns it a score (probability) of being a “bot” (Ferrara 2018). A tried and relatively accurate approach (Yang et al. 2019), it is appropriate for cross-validating the other methods utilized.
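The first of the three methods can be sketched as follows. The exact trimming and matching rules are not specified in the text, so this is only one possible reading: a tweet counts as “identical” if its full text recurs, and as “similar” if its core, obtained by dropping a few tokens from each end, appears inside another tweet. The `trim` and `min_core_tokens` values and the function names are assumptions for illustration, and retweets are assumed to have been removed beforehand.

```python
from collections import Counter


def trimmed_core(text, trim=2):
    """Lower-case the tweet, split on whitespace, drop `trim` tokens from each end."""
    tokens = text.lower().split()
    return " ".join(tokens[trim:len(tokens) - trim])


def duplicate_and_similar_share(tweets, trim=2, min_core_tokens=3):
    """Return (share_identical, share_similar) for a list of tweet texts.

    identical: the whitespace-normalized, lower-cased text occurs more than once.
    similar: not identical, but the trimmed core occurs inside another tweet.
    """
    normalized = [" ".join(t.lower().split()) for t in tweets]
    full_counts = Counter(normalized)
    identical = similar = 0
    for i, text in enumerate(normalized):
        if full_counts[text] > 1:
            identical += 1
            continue
        core = trimmed_core(text, trim)
        if len(core.split()) < min_core_tokens:
            continue  # core too short to be a meaningful match
        # Pad with spaces so that the core only matches on whole tokens.
        if any(f" {core} " in f" {other} " for j, other in enumerate(normalized) if j != i):
            similar += 1
    return identical / len(tweets), similar / len(tweets)
```

The pairwise comparison is quadratic in the number of tweets, which is acceptable for a per-community, per-event sample but would need indexing for larger collections.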

We find that in the two main political communities, the proportion of duplicate and similar tweets varied by event; however, the pro-government community demonstrated a much larger proportion of both in all events (Table 30.5). For instance, in the Medvedev sample, 43% of tweets are very similar in the pro-government community, many spreading the positive news story of government support for Russia’s petrochemical industry on May 24. The low proportion of duplicate content in the opposition community is intriguing, as is the relatively high proportion of similar content. A possible explanation could be that the opposition is a more dynamic group of users who follow more sophisticated bots that utilize more complex natural language or image methods to spread content. We propose this fascinating puzzle as a question for further research in the area.

Table 30.5 Duplication and similarity of content in two main communities (% identical; % similar)

Outside the main political communities, qualitative and duplicate/similarity analysis revealed that many visually segregated communities were made up to a very large degree of “bots,” or at least accounts sharing very similar content (Table 30.6). The more unsophisticated groups posted identical or very similar content, accounting for as much as 83% of all tweets for an event (community 7, dark orange). Others showed more complex approaches, such as tweeting news headlines or factual statements, with or without a corresponding link in the tweet. Interestingly, when links were present, they often pointed to Yandex News or even more commonly to heterodox news or blogging sites. While the information captured in the samples of the six events for these communities was often political, the public timelines of the “bot” accounts often included completely unrelated content, such as commercial and advertising material. We assume that these communities of “bots” are owned by marketing or consulting firms and tweet specific content based on the requirements of their clients, without any particular impact on actual political discussions.

Table 30.6 Sample of “bot” communities detected (% identical; % similar)

Evaluating the average automation probability of users in each community, as reported by Botometer, reinforces our findings (Table 30.7). Communities of real users had on average a low probability of being automated, with a relatively small proportion of users removed or suspended. By the same token, communities previously identified as highly likely to be “botnets” or to have a large prevalence of “bots” (Communities 3, 7, 8 and others) had a much higher average probability of being automated. Moreover, a large proportion of accounts in these communities have since been suspended by Twitter. Recently the company expanded its activities to diminish the impact of bots and trolls by suspending multiple accounts.Footnote 12 The results of these actions reinforced our findings, as the proportion of suspended accounts in the communities we identified as botnets was much higher than in real user communities. Indeed, entire “botnets” have all but been suspended by Twitter in the two years since the samples were originally collected (see Table 30.7). On the other hand, some remain relatively untouched. From all of the above indicators, we conclude that “bots,” while undeniably highly numerous and often verbose on Russian Twitter, are often segregated to isolated mini communities that have little impact on real politically engaged users. Real political communities do likely possess a certain proportion of bots; however, as identified in the literature (Kollanyi 2016; Murthy et al. 2016), these bots are likely to be complex and highly sophisticated, making their study challenging but their potential impact on shifting real conversations greater.

Table 30.7 Botometer (Bot or Not) results (average universal probability of automation; proportion of accounts no longer present two years after samples were originally collected)

5 Conclusion

Data on public and political engagement of Russian citizens, active political discussions and debates, and even protest coordination activities, are readily available to researchers studying Russian politics. This chapter illustrates how SNA can be instrumental and indispensable in evaluating key research hypotheses utilizing such data. Using six resonant political discussions collected from Twitter over the summer of 2016, we validate multiple political and sociological theories important for Russian studies. Firstly, we find that Russian online society is divided along the same ideological lines as the public sphere, representing two distinct and consistent communities of users, one supporting the Kremlin, and a “non-systemic” opposition that opposes it. Secondly, we validate the presence of “echo-chambers” in Russian social networks, identifying polarization between the two main political groups. Thirdly, we observe that “influencers” on Russian Twitter are not generally traditional political elites, but “interesting” and highly informative users such as famous pundits, parody accounts, or news sources. Furthermore, given regular users’ isolation into self-created separate ideological communities as well as further mini-communities, we note that the expected impact of information control strategies by the government is likely quite limited.

We obtain our results through the application of a thorough and holistic approach in evaluating the network structure, focusing on three levels of analysis. Macroscopic methods, such as community detection and network visualization, support the evaluation of the “overall” picture of online Russian political society. Mesoscopic methods validate the detected structures and provide insight into the sub-structure of each detected community. Finally, microscopic methods identify the “influential” users who are able to widely disseminate content and impact political discussions.

We find that SNA is also an economical method to detect “bots” in social networks and evaluate their impact on real political users. The topic is currently hotly debated both internationally (Murthy et al. 2016; McKelvey and Dubois 2017; Ferrara 2018) and in Russia (Stukal et al. 2017; Lawrence 2015), and our findings demonstrate that the impact of “bots” on Russian social networks is likely quite negligible. We find that while numerous and often verbose, “bots” are mostly isolated to mini-communities far removed from real politically engaged users, and as such are unlikely to impact real political discussions. We conclude by noting that the efficiency of SNA in extracting real and valid structures in Russian social networks makes it a prerequisite and foundation for the application of further advanced methods, such as topic or sentiment analysis, when studying Russian politics (for more on sentiment analysis, see Chap. 28).