Journal of Computational Social Science, Volume 2, Issue 2, pp 183–205

The intertwined cyberbalkanizations of Facebook pages and their audience: an analysis of Facebook pages and their audience during the 2014 Hong Kong Occupy Movement

  • Chung-hong Chan
  • Junior Yuner Zhu
  • Cassius Siu-lun Chow
  • King-wa Fu
Research Article


This study tests the hypothesis that information sources (e.g., Facebook pages) that share information with each other more frequently have a higher level of audience overlapping. This association is also hypothesized to be politically motivated. To test the empirical relationship, a Facebook page sharing network was created using the information shared between 1453 Facebook pages during a social movement in Hong Kong. The sharing frequency between two pages was denoted as the page-level edge weight. The audiences of the Facebook pages, i.e., commenters and likers of each page’s posts, were collected. The Jaccard similarity coefficient between the audiences of two pages was measured as the audience-level edge weight. In a network regression analysis, the page-level and audience-level edge weights were significantly associated. To show that this relationship is politically motivated, 1076 audience members were randomly selected and their political preferences were labeled by inference from their Facebook profile pictures. Using machine learning models, the repertoires of Facebook pages that they had interacted with could predict their political preferences. Our study demonstrates that selective sharing between information sources is associated with the division of their audiences into enclaved subgroups with similar political ideologies.


Cyberbalkanization Social media Political polarization Audience analysis 







ES: Engagement similarity

AO: Audience overlapping


Much has been written about fragmentation on social media, especially during controversial and significant public debates. Since the large-scale collective action known as the Hong Kong Occupy Movement broke out in 2014, the city’s public opinion has been seriously divided over the relationship between the Hong Kong and Mainland China governments, as well as over the way the Hong Kong police brutally suppressed the protests. Background on the Hong Kong Occupy Movement can be found elsewhere [14, 42].

The fragmentation was observed both online and offline. The offline world was divided into groups with different opinions about the pro-democracy protests in Hong Kong. For example, the Yellow Ribbon group supported the pro-democracy protests, whereas the Blue Ribbon group supported the police crackdown on the protests. In the online world, a similar fragmentation was observed. These coexisting online and offline fragmentations enable researchers to study the relationship between the two [13].

The observable online fragmentation is driven by a process called cyberbalkanization, in which groups or individuals self-sort into like-minded communities that ignore or hate one another [10, 32, 64]. Sunstein [58, 59, 60, 61] suggests that this phenomenon causes citizens’ political attitudes to diverge toward ideological extremes, i.e., political polarization, but evidence for such a relationship is weak. For example, Sunstein argues that self-sorted, like-minded communities on social media prevent users from being exposed to alternative viewpoints [61]. Many studies, however, have found that social media encourage user exposure to more diverse political information (e.g. [3, 4]) and facilitate cross-ideological interactions (e.g. [23]).

However, cyberbalkanization is more evident during social movements and is allegedly associated with polarized opinion. Using post-sharing data collected from Facebook pages, a previous study substantiated a temporal precedence of cyberbalkanization over opinion polarization during the 2014 Hong Kong Occupy Movement [13]. In that study, daily frequencies of post-sharing between like-minded Facebook pages and between non-like-minded Facebook pages were operationalized as an index of cyberbalkanization, which was found to be a leading indicator of the polarized rating of the city leader obtained from a telephone survey. However, such an association is ecological; a theoretical mechanism linking cyberbalkanization and opinion polarization at the individual level is still lacking. One important theoretical question is how the posts shared between Facebook pages can lead to opinion polarization among individuals. One essential first step, according to a proposed model [13], is to establish how the fragmentation of Facebook pages is related to the fragmentation of information seekers. However, such a relationship has not been substantiated. To explore it, we propose that cyberbalkanization actually involves two distinct yet intertwined groups of social media actors: information sources and information seekers [43, 48, 57]. In the Facebook context, these are Facebook pages (abbr.: P) and the Facebook audience (abbr.: A), respectively. Under the current conceptualization, the two groups of actors are self-sorted into groups (i.e., cyberbalkanized) in different ways. Page administrators seek information from other pages and share it with the page’s audience, while users (the audience) seek information from pages to be informed. Because there is self-sorting in both the page-to-page (P2P) and audience-to-page (A2P) information seeking relationships, cyberbalkanization should be observed in both.
Readers should note that a third level of cyberbalkanization is possible, namely the audience-to-audience (A2A) information seeking relationship. However, it is not feasible to study A2A without access to proprietary Facebook data. Studying page-to-audience (P2A) relationships is not feasible either, because most user accounts have closed timelines and, therefore, most of their messages cannot be publicly shared onto a page’s timeline.

The relationships of P2P and A2P cyberbalkanization are hypothesized to be intertwined. The theoretical foundation of this intertwined relationship is based on the contemporary one-step flow model proposed by Bennett & Manheim [7]. Facebook pages are understood as the “mass media” in the one-step flow model in which (p. 213):

the mass media in the one-step flow are increasingly fragmented and differentiated, they contribute to the individualizing process through shrinking audiences, demographically driven programming, and transmitting targeted political advertising and news spin.

As the Facebook audience is also selective in choosing preferred Facebook pages from which to seek cognitively consonant information (A2P), audience members are segregated into enclaved audience groups. A key assumption is that Facebook users usually engage with information congruent with their own political ideologies [9]. Sometimes this assumption does not hold, because some users might express their political disagreement by commenting on information (e.g. [6]). A recent study has shown that this deliberate engagement with opposing views can increase polarization [2]. Nonetheless, empirical evidence from Hong Kong supports that social media users engage with politically congruent information more often than with information expressing opposing views [42].

To our knowledge, however, few studies, if any, have investigated how A2P and P2P are intertwined. To obtain a holistic view of the cyberbalkanization phenomenon, the question should be tested against empirical social media data.

A previous study has shown that the Facebook page sharing network is P2P cyberbalkanized and can be organized into multiple distinct communities mainly defined by political ideology [13]. However, whether P2P cyberbalkanization extends to A2P remains uncertain. This is the key step needed to explain the relationship between P2P cyberbalkanization and opinion polarization, because it concerns the role played by Facebook pages organized by like-mindedness in dividing their audience into isolated groups.1 The current study systematically examines the relationship using two different approaches [68]: the user-centric approach of engagement similarity of the Facebook audience and the audience-centric approach of audience overlapping of Facebook pages. To analyze the data using these two approaches, a method for collecting the audiences of Facebook pages was first derived.

Audiences of Facebook pages

The dictionary definition of audience reads, “the (number of) people watching or listening to a particular television or radio programme, reading a particular book, or visiting a particular website”,2 which assumes that audience members are passive consumers of information, i.e., a collection of spectators. On social media, however, passive information reception cannot capture the full spectrum of online activities. Those who interact with the content should also be counted as part of the internet audience in the age of social media, because of the wider range of activities one can perform with online content [8]. Online engagement is a concept that describes a broader range of users’ online experience enabled by technological features such as interactivity, content production, and a distributed networked environment. For example, a Facebook page enables user engagement through “likes” as a way to subscribe to the page’s contents; a Facebook user can click “social buttons” to share, “like”, or comment on an individual post on a Facebook page. One’s sharing, liking, and commenting acts can, if specific privacy options are enabled, be “broadcast” to one’s social graph or appear in public. Facebook’s “social buttons” are known as a means of individual participation in political communication and have been described as “tools of political voice” [25]. Such activities can simplify civic engagement via social connectivity, and they can be considered a form of political participation [29, 30]. In summary, the audience on social media should not simply be regarded as a “collection of spectators” but as a “collection of media users” who use the media by viewing and/or engaging with the information provided. This study utilizes a large data set of audience and page relationships to study the P2P and A2P cyberbalkanizations.
As theorized previously, A2P cyberbalkanization might be associated with P2P cyberbalkanization, but to the authors’ knowledge no previous study has explored this relationship.

Measurement of A2P

How A2P cyberbalkanization can be measured is not straightforward. Drawing on the literature on audience fragmentation, there are two measurement approaches: the user-centric approach and the media-centric approach.

User-centric approach

The unit of analysis of the user-centric approach is the individual social media account. The fragmentation of media users is often measured by studying their media repertoires. A media repertoire is the set of media sources that a person regularly consumes, and it has been used to understand media users’ responses to the overwhelming abundance of media outlets [69], positing that instead of exploring an expanding set of available outlets, people consume media content that gratifies their needs or matches their preferences. Early literature on media repertoires usually focuses on the composition of repertoires (e.g. [44]). Recent studies explore the political implications of repertoire formation. For example, studies have shown that political interest and political knowledge (but not political ideology) are predictors of the composition of media repertoires [31, 62, 65, 71]. These findings also suggest a broad pattern of “news-seekers” and “avoiders” [41], where half of the people in the US (“news avoiders”) opted not to read news.

The repertoire-based analysis has been extended to the investigation of online news exposure and social media usage, but it is usually subsumed under the broader term “news consumption”. Flaxman et al. [24] deployed a web tracker to study the web browsing history of 1.2 million US web users and their frequency of browsing slanted online news outlets. Based on each user’s browsing frequency distribution, they assigned a political polarity score and studied the distribution of that score across all users. They indeed identified a slightly bimodal distribution of the polarity score, but most users were still centrists. In another study by Dvir-Gvirsman [22], a small sample of 397 Israeli users was surveyed on their political ideologies and had their web browsing history recorded. Using a social network analysis approach, she treated subjects as nodes and co-visits to the same online news outlet at least twice as edges. She found that users with a more extreme political ideology were associated with a higher homophilic tendency in news website exposure, i.e., similar media repertoires. Dvir-Gvirsman [22] coined “audience homophily” to describe “one’s preference for partisan media websites catering to a homogeneous, like-minded consumership.” Despite her contribution to conceptualization, her approach of dichotomizing a continuous edge into a binary outcome using an arbitrary threshold could be improved by analyzing the edges as continuous values instead. Also, the selection of cases in her study was based on existing political ideologies and could have introduced selection bias.

In this study, a network analysis approach similar to that of Dvir-Gvirsman [22] is adopted. Instead of using co-visits to news outlets as a binary edge weight between nodes, a continuous edge value of “engagement repertoire similarity” is defined to assess the similarity between users’ patterns of engagement with pages. To demonstrate audience homophily in such an engagement repertoire similarity network, the triad census approach was used to quantify the microstructure of the network. This census approach provides structural information about the network. By comparison with census information from randomly generated benchmark networks, we can observe organized structures among nodes that are difficult to visualize [12].

Triad census

A triad is a subgraph of three nodes in a network. Following the nomenclature of Davis and Leinhardt [21], each triad configuration has its own label. Based on Heider’s Social Balance Theory [33], some triad configurations are considered structurally balanced [11], such as 102 (‘enemy of my friend is my enemy’ or ‘mutual ignoring’) and 300 (‘friend of my friend is my friend’). The distribution of balanced triads in a network has been used in previous studies to characterize the properties of a network. For example, Wang and Thorngate [66] studied the distribution of triads in a network and found that the formation of only balanced triads can lead to the division of the network into two isolated factions, which they named ‘social mitosis’. In this study, the engagement similarity network is analyzed with the same triad census methodology. The formation of excessive balanced triads in the engagement similarity network can serve as evidence of audience homophily.


H1: Audience is more cyberbalkanized than natural.

Audience-centric approach

Traditionally, the second approach to measuring audience fragmentation is called the media-centric approach. Its unit of analysis is the media (information sources). Fragmentation of audience under the media-centric approach is typically presented by plotting the distribution of the audience size of each media outlet, which is often visualized as a long-tailed distribution [1]. Anderson ([1], p. 183) proclaims that such long-tailed distribution of audience numbers for media outlets is ‘‘leading to an explosion of variety and abundant choice in the content we consume are also tending to lead us into tribal eddies. When mass culture breaks apart it doesn’t reform into a different mass. Instead, it turns into millions of microcultures.’’ Even so, this assertion has been criticized by Webster and Ksiazek [68] as over-simplistic, because a static view of the long-tail distribution cannot capture the dynamics of the fragmentation process, for example, user preference for niche media and audience loyalties. Instead, Webster and Ksiazek [68] advocate an “audience-centric approach” to study fragmentation using social network analysis [40], in which a “social network of media audience overlapping” was constructed based on data on viewers’ habits. In that network, media outlets are denoted as nodes and the edge between two nodes is calculated from audience duplication (or audience overlapping). Webster and Ksiazek found that audience overlapping between media outlets was very high and, therefore, claimed that media outlets with isolated audiences do not exist (“myth of enclaves”). Readers should note that the media outlets included in Webster and Ksiazek’s study were 236 mainstream media, and Facebook was considered to be one of these media outlets.
One probable alternative explanation for Webster and Ksiazek’s finding [68] of a highly overlapped audience pattern among media outlets is that their threshold for overlapping audience was too low and, therefore, media outlets were densely connected. Their approach, like that of Dvir-Gvirsman [22], could be improved by analyzing edges as continuous values.

In this study, Webster and Ksiazek’s approach was adopted to investigate the fragmentation pattern of the audiences of Facebook pages. However, how audience overlapping in such a network should be calculated is still a hotly debated topic. A useful overview of how audience overlapping should be measured is available in Mukerjee et al. ([49], pp. 31–35). In this study, a network of pages’ audience overlapping was constructed based on the Jaccard coefficient, where the nodes denote Facebook pages and the edges denote the level of audience overlapping between two pages, with continuous proportional weights ranging from 0 to 1. The audience overlapping measurement we used adjusts for the size of the audience. Our measurement is very similar to the Phi coefficient of Mukerjee et al. [49], because both Jaccard and Phi are normalized distance measures of two sets of audience from two sources.3 However, we did not filter the edges as in other studies [46, 47], owing to a difference in research goals. In this study, we aim to study the correlation between two distance metrics between pages. In the previous study, the distance between two pages was measured by the frequency of sharing for the P2P cyberbalkanization [13]. For the A2P cyberbalkanization, the distance is measured by the extent of audience overlapping. To test whether P2P and A2P cyberbalkanizations are related, the two distances, i.e., frequency of sharing and audience overlapping, were measured and network regression analysis was applied to disentangle their inter-relationship.


H2: Cyberbalkanization of pages (P2P) is associated with the cyberbalkanization of their audience (A2P).

Politically motivated information seeking

One key question of audience fragmentation is how to explain its formation. As pointed out by previous studies, the fragmentation of media users can be driven by differences in interest, e.g., whether or not one is interested in politics [41]. Although previous studies have suggested that the variation of media repertoires among the audience can indicate political polarization [37], whether the fragmentation is caused by differences in political ideology remains unknown. Similar to the approach of Dvir-Gvirsman [22], if only audience members with known political ideologies were selected and a subset analysis was conducted, a greater level of audience fragmentation could be observed than in an analysis of a more diverse audience. This can serve as indirect evidence that cyberbalkanization has a greater effect on those with explicit political ideologies.

Previous studies have used social media data to predict human traits such as personality [38, 70] and mental wellbeing [15]. Some studies estimate the political preference of social media users using the follower–followee relationship on Twitter [5, 18]. Barberá et al. [5] attributed the power of social media data for political preference prediction to “similarity and dissimilarity in their use of social media when it comes to sharing information about politics and current events”. In the language used in this paper, the prediction power comes from the selective seeking of information sources by the audience due to A2P cyberbalkanization. With the broader definition of audience in the current study, we should also be able to predict the political preference of audience members using only the pages they engaged with. Moreover, such a prediction is evidence of politically motivated audience homophily.


RQ1: How accurate is the prediction of audiences’ political ideologies using the Facebook pages they engaged with?

Collection of political preference

In the study by Dvir-Gvirsman [22], the political ideologies of the audience were determined by a survey and, therefore, were laborious to collect. In this study, a proxy is determined for a subset of randomly selected audience members. Informational use of social media has been found to be positively associated with individuals’ engagement in civic and political activities [26]. Some kinds of online political involvement can be superficial, in the form of so-called clicktivism [29, 30] or slacktivism [17]. Examples of these forms of involvement include changing one’s Facebook profile picture to show support for a political ideology, changing the main color of a Twitter avatar to demonstrate attitudes toward political events, forwarding e-mails about political incidents to other people, etc. Whether clicktivist types of online political participation can represent one’s actual political ideology is a topic of interest to social media researchers.

During the Hong Kong Occupy Movement in 2014, supporters of the movement changed their Facebook profile pictures to yellow ribbons and, in response, their counterparts in the pro-government camp changed their Facebook profile pictures to “blue ribbons”. Interestingly, besides these two opposing political camps, the ideology of “Localism” emerged during this period. Some Hong Kong people masked their profile pictures with the slogan “I am a HongKonger” to express their discontent with the decision of the Central Government of the People’s Republic of China to restrict the nomination of electoral candidates in the universal suffrage of the Chief Executive in Hong Kong. The political spectrum of Hong Kong began to undergo a gradual change and a prominent new political faction was established, commonly referred to (mainly by mass media) as the “Localists”, with its ideology named “Localism” [35]. The change of profile pictures on Facebook was a symbolic label of people’s supporting camps in the protests, and the antagonistic relationship between “yellow” and “blue” was favorable for unequivocal classification of political preferences. This political background of Hong Kong allows us to retrieve one’s political preference from his/her profile picture on Facebook and examine its relationship with social media engagement.


Sharing network between Hong Kong Facebook pages

Detailed procedures of the Facebook data collection have been reported in Chan and Fu [13]. A brief summary is provided here. Between July 1, 2014 and June 30, 2015, a total of 2983 Hong Kong-based Facebook public pages were iteratively sampled using a snowball sampling approach. The iteration started from five seed pages with overtly different political stances, namely scholarism, supporthktv, passiontimes, salutetohkpolice and supportnationaleducation. A well-trained human rater manually went through the list to check whether each listed page mainly published information related to Hong Kong and discarded the irrelevant ones. After that, a total of 2983 pages were retained, whose publicly accessible posts with sharing records were then retrieved through the Facebook Graph API. From these sharing records, a sharing network was constructed. In this network, nodes are Facebook pages. Weighted, directed edges between two nodes denote that one Facebook page shared information from another page, with the edge value being the sharing frequency over the entire study period. This edge value was used previously as the tie strength between two pages [13]. Next, using the walktrap community detection algorithm [51], ten communities with at least 30 member nodes were extracted from the sharing network. All member pages of these communities were included in further analysis. The included pages were not all related to politics, e.g., many entertainment pages were included. The reason for applying a community detection algorithm is to extract the cyberbalkanized portion of the entire Facebook sharing network, because the original definition of cyberbalkanization involves the formation of isolated, observable communities (see the definition in the first paragraph).
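As a minimal sketch of how the weighted, directed sharing network described above could be assembled, the following Python snippet counts sharing events per ordered pair of pages. The record layout `(sharing_page, source_page)`, the function name `build_sharing_network` and the decision to drop self-shares are assumptions made for illustration; they are not taken from the original data pipeline.

```python
from collections import Counter

def build_sharing_network(share_records):
    """Count shares per ordered page pair.

    share_records: iterable of (sharing_page, source_page) tuples
    (a hypothetical layout). The returned Counter maps each directed
    edge to its sharing frequency over the whole study period.
    """
    weights = Counter()
    for sharing_page, source_page in share_records:
        if sharing_page != source_page:  # assumption: self-shares are ignored
            weights[(sharing_page, source_page)] += 1
    return weights

# Toy example with made-up page names.
records = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "C")]
network = build_sharing_network(records)
```

In this toy example the edge from A to B carries a weight of 2 and the reverse edge a weight of 1, which mirrors the directed edge values used as tie strength.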

Sampled audiences of Facebook pages

First, we developed a scheme to sample representative audience members of a Facebook page. Intuitively, subscribers to a Facebook page seem to be the right target. However, the Facebook Graph API does not provide a function to obtain the full list of subscribers of a Facebook page (in Facebook terminology, confusingly, called ‘likers’). Besides, non-subscribers can also actively interact with content published by a Facebook page. Thus, we derived a method to estimate the audience of Facebook pages [13].

The sampling procedures are as follows:
  1. A single day was randomly selected from each week of the study period;
  2. On each selected day, we extracted the message with the highest comment count and the message with the highest like count from each Facebook page;
  3. All unique user IDs of post-likers and post-commenters were retrieved;
  4. Across the 52 weeks in the study period (i.e., 52 × 2 posts = 104 posts at most), all unique post-likers and post-commenters were extracted and defined as the audience of that Facebook page.
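The sampling steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: each post is a dictionary with the hypothetical keys `page`, `date`, `likers` and `commenters`, weeks are taken as consecutive 7-day blocks from the start date, and the likers and commenters of both selected posts are pooled.

```python
import random
from collections import defaultdict
from datetime import date, timedelta

def sample_days(start, weeks, rng):
    """Pick one random day from each consecutive 7-day week (step 1)."""
    return [start + timedelta(days=7 * w + rng.randrange(7)) for w in range(weeks)]

def page_audience(posts, days):
    """Pool likers and commenters of the top posts per page and sampled day.

    posts: list of dicts with hypothetical keys 'page', 'date',
    'likers' (set of user IDs) and 'commenters' (set of user IDs).
    """
    sampled = set(days)
    by_page_day = defaultdict(list)
    for post in posts:
        if post["date"] in sampled:
            by_page_day[(post["page"], post["date"])].append(post)
    audience = defaultdict(set)
    for (page, _day), day_posts in by_page_day.items():
        # Step 2: most-commented and most-liked post of the day.
        top_commented = max(day_posts, key=lambda p: len(p["commenters"]))
        top_liked = max(day_posts, key=lambda p: len(p["likers"]))
        # Steps 3 and 4: unique user IDs, accumulated over all sampled days.
        for post in (top_commented, top_liked):
            audience[page] |= post["likers"] | post["commenters"]
    return dict(audience)

days = sample_days(date(2014, 7, 1), 52, random.Random(0))
```

Because sets are used throughout, a user who both likes and comments, or who appears on several sampled days, is counted once per page.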


Following the above procedures, we obtained a group of likers and commenters of the 1644 Hong Kong Facebook pages collected from snowball sampling. Some Facebook pages were no longer available when we attempted to retrieve liker and commenter data in October 2015. In the end, data for only 1453 pages were obtained. To exclude audience members who did not engage with a diverse range of pages, only those who commented across at least three different Facebook pages were included in further analysis.

The network of Facebook pages and their audience is a two-mode network with two types of nodes (pages and audience) and two types of edges (P2P sharing and A2P information seeking). This two-mode network was converted into two separate one-mode networks for further analysis, namely the engagement similarity network and the audience overlapping network. They were created to test our hypotheses: H1 with the engagement similarity network and H2 with the audience overlapping network. The matrix version of the two-mode network was used to study RQ1.

Engagement similarity network

The engagement pattern of a user u is represented by a feature vector Γ(u) of engagement frequency, either liking or commenting, on all Facebook pages. The ith element Γi(u) denotes the frequency of engagement on the Facebook page indexed by i, where i ranges from 1 to 1453, i.e., all available Facebook pages.

For a pair of Facebook users, their engagement similarity is calculated by evaluating the distance function between their engagement feature vectors. The engagement similarity ES(u1, u2) is computed at the dyadic level between users u1 and u2, as the cosine similarity between Γ(u1) and Γ(u2) (see Eq. 1). The magnitude of ES ranges from 0 (completely different) to 1 (exactly the same) which quantifies the similarity of Γ(u) between the two users.
$$ {\text{ES}}(u_{1} ,u_{2} ) = \frac{{\left\langle {\varGamma (u_{1} ),\,\varGamma (u_{2} )} \right\rangle }}{{||\varGamma (u_{1} )||\,\cdot\,||\varGamma (u_{2} )||}} $$
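To make the cosine similarity concrete, a minimal Python sketch is given here. The function name `engagement_similarity` and the convention of returning 0 when a user has no recorded engagement are assumptions for illustration only.

```python
import math

def engagement_similarity(gamma_u1, gamma_u2):
    """Cosine similarity between two engagement frequency vectors."""
    dot = sum(a * b for a, b in zip(gamma_u1, gamma_u2))
    norm_u1 = math.sqrt(sum(a * a for a in gamma_u1))
    norm_u2 = math.sqrt(sum(b * b for b in gamma_u2))
    if norm_u1 == 0 or norm_u2 == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_u1 * norm_u2)

# Two users engaging with the same pages in the same proportions.
same_profile = engagement_similarity([2, 0, 1], [4, 0, 2])
# Two users engaging with disjoint sets of pages.
disjoint = engagement_similarity([1, 0, 0], [0, 3, 0])
```

Since engagement frequencies are non-negative, the value stays between 0 and 1; the first pair above is (up to floating-point rounding) maximally similar, and the second pair has a similarity of 0.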

ES was then used as the tie strength between users in a network. An undirected weighted network, in which a node represents a user and an edge between two users carries their ES as weight, was constructed. The goal of the current study is to analyze this engagement similarity network and conduct a triad census to show that balanced triads [36], i.e., 300 and 102, are the dominant triad configurations in the network when compared with control networks whose engagement patterns were randomly generated. H1 posits that triads with either one edge or three edges of very high ES value are more likely to be found in the observed ES network than in the randomly generated one.

Estimated triad census of engagement similarity network

Since evaluating the engagement similarity between all combinations of dyads and triads is computationally expensive,4 we used another method to estimate the triad census result, based on repeated random sampling. Repeated random sampling of a network has been previously used by Granovetter [28]. This approach creates a sampling distribution of the parameter ρn (where n = 1, 2, 3, …, the total number of samples drawn) from repeated sampling of the entire ES network with replacement, and then estimates the population parameter P (the proportion of a certain triad) with the mean value of the sampling distribution. Drawing on the Central Limit Theorem [34], the mean of the samples’ parameter approximates the population parameter P as n approaches ∞.
$$ P \simeq \frac{{(\rho_{1} + \rho_{2} + \rho_{3} + \cdots + \rho_{n} )}}{n} $$
The following procedure was repeated 1000 times, i.e., n = 1000.
  1. A random sample of 300 users was selected;
  2. For each pair of users u1 and u2 in the random sample, the dyadic ES(u1, u2) was calculated;
  3. All possible triad configurations of the 300 users were constructed. The triads were censused based on the arrangement of the three valued edges, i.e., vector x = {ES(u1, u2), ES(u2, u3), ES(u1, u3)}, where u1, u2 and u3 form a triad;
  4. As the networks were weighted undirected networks and there is no methodology for conducting a triad census on such networks, we derived an operational definition of the triad configurations analogous to the equivalent versions in a signed network. The four possible configurations are:

(i) 000: all three edges are zero;
$$ {\text{ES}}(u_{i} ,u_{j} ) = {\text{ES}}(u_{j} ,u_{k} ) = {\text{ES}}(u_{i} ,u_{k} ) = 0 $$
(ii) 102: the edge with the highest value, say the pair (ui, uj), is more than three times the average of the remaining two edges, i.e., the pairs (uj, uk) and (ui, uk);
$$ {\text{ES}}(u_{i} ,u_{j} ) > 3 \times \frac{{{\text{ES}}(u_{j} ,u_{k} ) + {\text{ES}}(u_{i} ,u_{k} )}}{2} $$
(iii) 300: the coefficient of variation, i.e., standard deviation (SD) divided by mean (\( \mu \)), of the edge values is no more than 20%;
$$ \frac{{{\text{SD}}[{\text{ES}}(u_{i} ,u_{j} ),{\text{ES}}(u_{j} ,u_{k} ),{\text{ES}}(u_{i} ,u_{k} )]}}{{\mu [{\text{ES}}(u_{i} ,u_{j} ),{\text{ES}}(u_{j} ,u_{k} ),{\text{ES}}(u_{i} ,u_{k} )]}} \le 0.2, \,\,\,\,\,{\text{where}} \,\mu [{\text{ES}}(u_{i} ,u_{j} ),\,\,{\text{ES}}(u_{j} ,u_{k} ),{\text{ES}}(u_{i} ,u_{k} )] \ne 0. $$

If conditions (ii) and (iii) are both satisfied, the triad is counted under condition (iii).

(iv) xxx: other configurations.
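The four operational definitions can be expressed as a small classifier. The Python sketch below is an illustration only: it assumes that SD is the sample standard deviation (the text does not specify which estimator was used) and that pairwise ES values are stored in a dictionary keyed by `frozenset` pairs, a layout chosen here for convenience.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, stdev

def classify_triad(es_ij, es_jk, es_ik):
    """Label a triad of continuous edge values as 000, 102, 300 or xxx."""
    edges = sorted((es_ij, es_jk, es_ik), reverse=True)
    if edges[0] == 0:
        return "000"  # all three edges are zero
    # 300 is checked before 102 because it takes precedence when both hold.
    if stdev(edges) / mean(edges) <= 0.2:  # assumption: sample SD
        return "300"
    if edges[0] > 3 * (edges[1] + edges[2]) / 2:
        return "102"
    return "xxx"

def triad_census(users, es):
    """Count configurations over all triads of `users`.

    es: dict mapping frozenset({u, v}) to the ES value of that pair.
    """
    counts = Counter()
    for i, j, k in combinations(users, 3):
        label = classify_triad(es[frozenset((i, j))],
                               es[frozenset((j, k))],
                               es[frozenset((i, k))])
        counts[label] += 1
    return counts

# Toy example: four users who all share the same similarity of 0.5.
toy_users = ["u1", "u2", "u3", "u4"]
toy_es = {frozenset(pair): 0.5 for pair in combinations(toy_users, 2)}
toy_counts = triad_census(toy_users, toy_es)
```

With a sample of 300 users there are 4,455,100 triads per replication, which illustrates why a full census of the entire network was replaced by repeated sampling.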

Some examples of balanced and unbalanced triads are displayed in Fig. 1.
Fig. 1 Examples of balanced and unbalanced triads in a network with continuous edge values

The estimated triad census was applied to the liker and commenter samples. For comparison with benchmarks [67], corresponding random benchmarks of the estimated triad census were generated for both the liker and commenter data by randomly shuffling the elements of each vector Γ(u) along the index of all Facebook pages, i.e., each user keeps the same overall intensity of engagement frequency but engages with a random selection of pages. The same method was used by Conover et al. [19] to generate random benchmarks.
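A minimal sketch of this shuffling benchmark is shown here, assuming each user's engagement vector is stored as a list in a dictionary keyed by user ID; the name `shuffled_benchmark` is hypothetical.

```python
import random

def shuffled_benchmark(gammas, rng):
    """Permute each user's engagement vector across page indices.

    gammas: dict mapping a user ID to a list of engagement frequencies.
    The multiset of frequencies per user is preserved, so the overall
    engagement intensity is unchanged while the pages are randomized.
    """
    benchmark = {}
    for user, gamma in gammas.items():
        permuted = list(gamma)  # copy so the observed data stay intact
        rng.shuffle(permuted)
        benchmark[user] = permuted
    return benchmark

observed = {"u1": [3, 0, 0, 1, 0], "u2": [0, 2, 2, 0, 0]}
random_control = shuffled_benchmark(observed, random.Random(42))
```

Each user is shuffled independently, so any similarity between users in the control data arises by chance alone.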

Using the engagement similarity networks constructed with liking and commenting data, the census results were generated from 1000 replications. The summary statistics of the observed data and the randomly generated data were compared. The 2.5th, 50th and 97.5th percentiles of the census results for each set of data were also computed for comparison.
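
The percentile summary of the replications can be sketched as below. The helper names are hypothetical, and the linear interpolation between order statistics is an assumption; the text does not state which percentile definition was used.

```python
def percentile(values, p):
    """Percentile by linear interpolation between order statistics (0 <= p <= 100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize(shares):
    """2.5th, 50th and 97.5th percentiles of one configuration's share of triads."""
    return tuple(percentile(shares, p) for p in (2.5, 50, 97.5))
```

Here `shares` would hold the percentage of triads in one configuration, with one value per replication.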

Audience overlapping network

An audience overlapping network was constructed in which the nodes were the Facebook pages. The audience of Facebook page p is represented by a feature vector Γ(p) of the engagement of users on that page. The ith element Γi(p) denotes the engagement (yes = 1, no = 0) on the page by the user indexed by i, where i ranges from 1 to the total number of unique users of all pages. The audience overlapping between a pair of Facebook pages, AO(pi, pj), is computed at the dyadic level between pages pi and pj as the Jaccard index between Γ(pi) and Γ(pj) (see Eq. 3). AO ranges from 0 (no overlapping) to 1 (complete overlapping) and quantifies the extent to which the audiences of the two pages overlap.
$$ {\text{AO}}(p_{i} , p_{j} ) = \frac{{\left| {\varGamma (p_{i} ) \mathop \cap \nolimits \varGamma (p_{j} )} \right|}}{{|\varGamma (p_{i} )| + |\varGamma (p_{j} )| - |\varGamma (p_{i} ) \mathop \cap \nolimits \varGamma (p_{j} )|}} $$

AO was then used to weight the ties between pages: an undirected weighted network was constructed in which a node represents a page and an edge between two pages carries their AO as weight.
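
The Jaccard computation and the resulting weighted network can be sketched with sets of user ids. The page and user names are hypothetical, and returning 0 for two empty audiences is an assumption made only to avoid division by zero.

```python
from itertools import combinations


def audience_overlap(audience_a, audience_b):
    """Jaccard index between two sets of user ids."""
    shared = len(audience_a & audience_b)
    union = len(audience_a) + len(audience_b) - shared
    return shared / union if union else 0.0  # empty vs. empty set to 0: an assumption


def build_ao_network(audiences):
    """Map each unordered pair of pages to its AO weight."""
    return {
        (p, q): audience_overlap(audiences[p], audiences[q])
        for p, q in combinations(sorted(audiences), 2)
    }


audiences = {
    "page_a": {"u1", "u2", "u3"},
    "page_b": {"u2", "u3", "u4"},
    "page_c": {"u5"},
}
network = build_ao_network(audiences)
```

In this toy example, page_a and page_b share two of four distinct users, so their edge weight is 0.5, while page_c has no overlap with either page.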

Relationship between audience overlapping network and sharing network

As the audience overlapping network and the sharing network were both networks of Facebook pages, the relationship between them can be studied by regression analysis. The exponential random graph model (ERGM) is the preferred method for analyzing small networks [27, 45]. However, empirical analysis suggests that ERGM is inaccurate and inefficient for networks with more than a few hundred nodes [55]. Although a scalable ERGM algorithm has recently become available for moderately large networks and has been tested on networks of under 3000 nodes by Schmid and Desmarais [54], we decided to use a simpler approach as an approximation. Nonparametric network regression [39] is an ordinary least squares linear regression model in which the dependent and independent variables are adjacency matrices. Because the network data structure violates the assumption of statistical independence between data points, statistical inference cannot be based on parametric tests such as the Wald test. An alternative hypothesis test, the quadratic assignment procedure (QAP) permutation test, was used in the current study to test the statistical significance of the independent variables.

To test H2, the association between the audience overlapping network and the sharing network was computed by the network regression method provided by the R function netlm() from the sna package. As the sharing network was directed but the audience overlapping network was undirected, the sharing network was converted to an undirected network by symmetrization (adding up the values of both edges between two nodes). For example, the two directed edges A-8->B and B-1->A are symmetrized to an undirected edge A-9-B. This symmetrization discards information about the discrepancy in sharing frequencies between two pages. Conceptually, the converted network understates the higher tie strength between two pages indicated by mutual sharing and overstates the tie strength of two pages when the discrepancy in sharing is larger. For example, an undirected edge A-10-B can result from (1) A-5->B and B-5->A, or (2) A-10->B and B-0->A. Clearly, the former case indicates higher tie strength between A and B. As indicated by a previous study [12], however, mutual sharing is quite rare in a sharing network, so our method of symmetrization might not discard much useful information.
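
The symmetrization and the idea behind a QAP permutation test can be sketched as follows. This is a simplified illustration and not a reproduction of netlm(): it fits a single-predictor slope on the upper triangle of two symmetric matrices and permutes the node labels of the dependent matrix. All function names are hypothetical.

```python
import random


def symmetrize(directed):
    """Sum both directions of each tie into one undirected weight."""
    undirected = {}
    for (src, dst), weight in directed.items():
        key = tuple(sorted((src, dst)))
        undirected[key] = undirected.get(key, 0) + weight
    return undirected


def dyad_slope(y, x):
    """OLS slope of y on x over the upper-triangle dyads of two symmetric matrices."""
    n = len(y)
    xs = [x[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [y[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx


def qap_p_value(y, x, reps=1000, seed=0):
    """Two-sided permutation p-value for the slope under node-label permutation."""
    rng = random.Random(seed)
    observed = dyad_slope(y, x)
    n = len(y)
    order = list(range(n))
    extreme = 0
    for _ in range(reps):
        rng.shuffle(order)
        # Permute rows and columns together so the network structure is kept.
        permuted = [[y[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(dyad_slope(permuted, x)) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# The worked example from the text: A-8->B and B-1->A become A-9-B.
undirected = symmetrize({("A", "B"): 8, ("B", "A"): 1})
```

Permuting rows and columns jointly preserves the dependence among dyads that share a node, which is why the permutation distribution can stand in for a parametric test here.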

Subset analysis of audience members with known political stances

To study RQ1, a group of audience members with known political ideologies was randomly selected. The political preferences of Facebook users were estimated based on their profile picture history.

The political preferences of Facebook users were manually coded based on the public record of their Facebook profile picture update history. Each user was classified according to which of the three major political factions in Hong Kong, namely Yellow Ribbon, Blue Ribbon and Localism, he or she appeared more likely to support.

The operational definitions of supporters of the three factions were as follows. “Yellow Ribbon”: an account that makes at least one symbolic reference to supporting the 2014 Occupy Movement, such as posting icons of the movement (for example, a yellow ribbon or an umbrella) or supporting a group of pan-democrat candidates; “Blue Ribbon”: an account that makes at least one symbolic reference to supporting the law enforcement bodies’ crackdown on the 2014 pro-democracy protests (for example, blue ribbons), supporting a group of pro-establishment candidates, the Hong Kong SAR government or the Beijing government, or emphasizing Chinese identity; “Localism”: an account that makes at least one symbolic reference to supporting democratic self-determination, Hong Kong nationalism or Hong Kong independence, supporting a group of Localist candidates, or emphasizing a local Hong Kong identity.

The classification into the above three factions is assumed not to be mutually exclusive, i.e., one can support more than one faction. To establish interrater reliability, a random subset of 150 Facebook users was drawn and three coders independently coded the update history of their Facebook profile pictures based on the above coding scheme. The kappa coefficients for the three factions were 0.934 (Yellow Ribbon), 0.954 (Blue Ribbon) and 0.957 (Localism), indicating very high interrater reliability. Next, the three coders manually coded 2815 randomly selected Facebook users, and only those with a known political preference were selected for further analysis.

Prediction of political ideologies of audience based on engagement pattern

Using the previously defined feature vector Γ(u) (engagement frequency, either liking or commenting, on all Facebook pages), the political ideology of each user u was predicted. In this analysis, the ith element Γi(u) denotes the total frequency of engagement (comments plus likes) on the Facebook page indexed by i, where i ranges from 1 to 1452, i.e., all available Facebook pages.

All audience members with known political ideology were randomly split into a training set and a test set on a 70:30 basis. The training set was used to train a classifier using the extreme gradient boosting (XGBoost) algorithm [16]. The hyperparameters of the XGBoost algorithm were tuned against prediction accuracy using tenfold cross-validation with data from the training set only.5 Three XGBoost models were trained and tuned to predict Yellow Ribbon, Blue Ribbon and Localism support, respectively. The fully tuned models were finally evaluated on the test set, and prediction performance metrics such as precision, recall and F1 score are reported. All statistical analyses were conducted with R.
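
The evaluation protocol, though not the XGBoost training itself, can be sketched as below. The function names, the seed and the binary label coding (1 = supports the faction, 0 = does not) are illustrative assumptions, not details taken from the study.

```python
import random


def train_test_split(items, train_share=0.7, seed=0):
    """Randomly split items into a training part and a test part (70:30 by default)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f1(actual, predicted):
    """Precision, recall and F1 for binary labels coded as 1 (positive) and 0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


train_users, test_users = train_test_split(range(10))
```

Because the three factions are not mutually exclusive, one such binary evaluation would be run per faction rather than a single multi-class evaluation.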


The number of commenters varied widely across the 1452 pages. The mean number of unique commenters per page was 623.49, ranging from 1 to 53,111 users. In total, there were 533,534 unique commenters, of whom only 64,886 commented on at least three pages.

Audience with known political ideologies

A group of 2815 audience members was randomly selected from the whole audience and their political ideologies were coded independently by three coders. Of these, 1076 had explicit political ideologies as reflected by their history of profile picture updates. Among these 1076 users, 962 (89%), 120 (11%) and 109 (10%) were classified into the Yellow Ribbon, Blue Ribbon and Localist camps, respectively.

Engagement similarity network: estimated triad census

The estimated triad census results of the engagement similarity network with commenting data are presented in Table 1. There were markedly fewer empty triads in the observed data than in the random benchmarks, while the liking ES networks were generally found to be more connected. The 300 and 102 configurations were at least 80 and 2 times more frequent, respectively, in the observed data than in the random benchmarks. This suggests that the two balanced triads (the 102 and 300 configurations) were considerably more prominent in the observed engagement network data than in the random benchmarks. The 102 configuration was the second most common configuration in the commenting ES network and the most common in the liking ES network. Therefore, H1 is supported.
Table 1

The estimated triad census results from all audience, subset of audience with known political preference and random benchmark

Configuration   All audience         Random benchmark
Commenting
  000           46.5 (40.64–52.33)   81.5 (79.4–83.4)
  102           46.4 (42.3–50.3)     17.9 (16.0–19.8)
  300           0.8 (0.5–1.2)        0.01 (0.01–0.02)
  xxx           6.3 (4.7–8.2)        0.6 (0.5–0.8)
Liking
  000           10.2 (7.5–13.8)      38.7 (33.2–44.2)
  102           57.9 (53.4–62.2)     45.5 (40.7–49.6)
  300           2.7 (2.0–3.7)        0.2 (0.1–0.3)
  xxx           29.0 (23.4–35.0)     15.7 (10.3–22.0)

Numbers in the table are median percentage of all triads (2.5–97.5 percentile) from 1000 replications

Network regression

The audience overlapping network was regressed on the sharing network. The regression coefficients are presented in Table 2. The sharing network was a statistically significant regressor of the audience overlapping network, indicating that, in general, a higher frequency of sharing between two pages was associated with a higher level of audience overlapping. Therefore, H2 is supported.
Table 2

Network regression models with valued sharing network as the dependent variable


Model 1    Model 2

Prediction of political ideologies

The prediction performance of the three XGBoost models is presented in Table 3. Using the engagement data, audience members siding with Yellow Ribbon and Blue Ribbon could be accurately predicted. However, the Localist audience was difficult to predict from engagement patterns.
Table 3

Performance metrics of the three XGBoost models


Training set    Test set

Higher value indicates better performance


Using a basket of methodologies, the current study untangled the rarely studied relationships between information sources and their audiences. Using a user-centric approach to study audience fragmentation, the current study confirms the presence of audience homophily, in which social media users cluster with those who share a similar online engagement pattern but distance themselves from those who engage differently. Using an audience-centric approach to study audience fragmentation, the results indicate that the disconnection between Facebook pages, in terms of a lack of information sharing, is also reflected in the segregation of their audiences. This relationship suggests that cyberbalkanization is an intertwined phenomenon between the fragmentation of pages and that of their audiences. To our knowledge, this is the first study to investigate this micro–macro linkage of cyberbalkanization, that is to say, the macro-level phenomenon of P2P cyberbalkanization is associated with the micro-level segregation of A2P cyberbalkanization of the audience.

To study whether the micro–macro linkage of cyberbalkanization is motivated by differences in political ideology, a subset of audience members who hold explicit political expressions was selected. In the predictive analysis, the findings suggest that audience members with similar views generally engaged with a similar set of pages, and one can exploit this pattern to predict the political ideologies of some Facebook users using only the pages they choose to engage with, except for users with a Localism political ideology.6

An early study of an Internet news site, Slashdot, pointed out that the power enabling readers to comment on news content is a characteristic of a public sphere [52], because the mechanism of reader engagement welcomes new or previously marginalized discussants. In the current study, we found that in an environment where plenty of commenting opportunities are available, Facebook users are very selective about which channel or which post they comment on. This selective engagement behavior echoes Dahlberg’s vision [20] of ‘issue publics’ (groups of people who pay attention to one particular issue) or what Papacharissi [50] describes as the ‘virtual sphere’ (‘several spheres of counterpublics that have been excluded from mainstream political discourse’). Our evidence seems to suggest that social media as a whole mostly fails to bring discussions between different interest groups together and instead drives them apart. Changes in platform design might help to encourage deliberative discussions between polarized online publics [53, 63]. Drawing on previous findings on social mitosis, the audience homophily we observed in the current study (excessive formation of balanced triads in real-life social media) might further lead to “engagement mitosis”, i.e., the emergence of only two groups in the entire engagement similarity network, with completely dissimilar engagement patterns between the two. Such a phenomenon might have significant implications for both online political deliberation and online media business. Nevertheless, a possible effect of cross-cutting exposure was also observed, and it can help to mitigate “engagement mitosis” among users with explicit political ideologies. We call for longitudinal studies to monitor changes in engagement similarity over time and their potential social impact.

“Divide et Impera”

Based on the findings, we posit that a group of active Facebook pages can take up a position similar to that of partisan media outlets in segregating their readers or viewers. Selective exposure plays a vital role in such user segregation, in which audience members selectively expose themselves to social media content that is consistent with their own political ideology. We propose here a “Divide et Impera” (“Divide and Conquer”)7 explanation of cyberbalkanization to integrate the two mechanisms of opinion polarization. The explanation consists of two steps, namely “Divide” and “Conquer”. The first step, “Divide”, outlines the process by which Facebook pages are associated with the segregation of the audience into insulated communities. The second step, “Conquer”, which needs to be ascertained in future studies, is that Facebook pages then feed messages with biased political information to their audience, who have been organized into insulated communities.


First, the current study cannot establish a cause-and-effect relationship between the selective sharing of Facebook pages and audience segregation. The relationship can go either way or might be spurious. However, a cause-and-effect relationship by definition cannot be established with an observational study, and an online experiment is practically impossible to conduct. Even though causality between selective sharing and audience segregation cannot be established, the above “Divide and Conquer” mechanism is still valid as long as the conditions of segregated audience and selective sharing coexist.

Second, the current study design cannot study non-engaged users, i.e., lurkers, even though these users might also be segregated by exposure to polarized information from pages. The analysis only included those who engaged with at least three pages so that we could study the similarity in engagement patterns between these Facebook users. Therefore, our results might not be applicable to casual users who engaged with fewer than three pages within the study period. Readers should be cautious about the study conclusions, which might only be applicable to frequent and outspoken users of social media. Alternative research methods, such as surveys or digital ethnography, might be more useful for studying the behavior of casual users.

Third, our estimation of the audience of a page might not be comprehensive because it is based only on a sample of the content generated by Facebook pages. Given the lack of access to the full raw data of Facebook and the limitations of the Facebook API, we believe the sampling strategy used in this study is unbiased. Further study is still warranted to validate our method of estimation with complete data obtained from Facebook Inc.

In the estimated triad census, our definitions of the 102 and 300 configurations might be arbitrary. The same set of operational definitions was imposed when analyzing both the observed empirical ES networks and the random benchmarks, so that our comparisons between networks are valid and interpretable. Since the magnitude of the triad distribution can still be affected by the operational definitions used in this study, we performed a small-scale sensitivity analysis and found that the main conclusion remains robust to small variations in the parameters.

In conclusion, the current study provides evidence to support an association between the cyberbalkanization of Facebook pages and the segregation of their audience. From a practical perspective, the similarity and dissimilarity in engagement patterns can be leveraged to predict the political ideology of outspoken Facebook audience members.

Availability of data and material

The anonymized data will be available on GitHub upon acceptance.


  1. Due to enclave deliberation, these isolated groups might develop more extreme views. However, the current study is not designed to demonstrate this. For the effect of enclave deliberation on political polarization and how to avoid it, please refer to Strandberg, Himmelroos, & Grönlund [56].

  2.

  3. We conducted a simulation study with simulated datasets. The two metrics (Phi and Jaccard) are highly correlated.

  4. Suppose the audience size is 600,000; there are then 179,999,700,000 dyads and 3.60 × 10^16 triads. If 500 triads can be computed per second, the whole triad census would take 2,283,094 years. For reference, 2,283,094 years ago dates back to the middle Old Stone Age, when the species Homo sapiens did not exist.

  5. The model tuning was performed with the R package caret.

  6. The low recall (i.e., those with a Localism political ideology being wrongly classified as non-Localism by the XGBoost model) for the prediction of Localism political ideology might be explained by the fact that Localists are a very heterogeneous group of audience members with different agendas and, therefore, different engagement patterns. Localists encompass a diverse range of users such as (1) autonomists, (2) pro-independence activists, (3) pro-democratic self-determination activists and (4) those who are disappointed by the old style of social movement and social administration.

  7. The quotation marks around “divide and conquer” are important because the meaning here deviates from the original divide et impera in one significant way: the original meaning implies a mastermind (such as Julius Caesar) behind the division and ruling processes, but the situation here is completely self-organized and self-imposed.
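
The arithmetic in footnote 4 can be checked directly. The sketch assumes an audience of 600,000 users, the size for which the reported triad count (3.60 × 10^16) and running time are reproduced, and a 365-day year; the variable names are illustrative.

```python
from math import comb

n_users = 600_000                      # audience size assumed for this check
triads_per_second = 500                # processing rate stated in the footnote
seconds_per_year = 365 * 24 * 60 * 60  # 365-day year: an assumption

dyads = comb(n_users, 2)               # number of unordered user pairs
triads = comb(n_users, 3)              # number of unordered user triples
years = triads / triads_per_second / seconds_per_year

print(f"{dyads:,} dyads")              # 179,999,700,000 dyads
print(f"{triads:.2e} triads")          # 3.60e+16 triads
print(f"{years:,.0f} years")           # 2,283,094 years
```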


Author contributions

Conceived and designed: CHC, JYZ, KWF; data collection: CHC, JYZ, CSLC; data analysis: CHC, JYZ; manuscript preparation: CHC, JYZ, KWF; all authors read and approved the final manuscript.


This research project (Project Number: 2013.A8.009.14A) is funded by the Public Policy Research Funding Scheme of the Central Policy Unit of the Government of the Hong Kong Special Administrative Region. Part of the first author’s PhD studentship is supported by the HKU SPACE Postgraduate Fund.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.


  1. Anderson, C. (2008). The long tail: why the future of business is selling less of more (Revised ed.). New York: Hachette Books.
  2. Bail, C. A., Argyle, L. P., Brown, T. W., Bumpus, J. P., Chen, H., Hunzaker, M. F., et al. (2018). Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences, 115(37), 9216–9221.
  3. Bakshy, E., Messing, S., & Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130–1132.
  4. Barberá, P. (2014). How social media reduces mass political polarization. Evidence from Germany, Spain, and the US. Job Market Paper, New York University. Retrieved from
  5. Barberá, P., Jost, J. T., Nagler, J., Tucker, J. A., & Bonneau, R. (2015). Tweeting from left to right: Is online political communication more than an echo chamber? Psychological Science, 26(10), 1531–1542.
  6. Barnidge, M. (2018). Social affect and political disagreement on social media. Social Media + Society, 4(3), 2056305118797721.
  7. Bennett, W. L., & Manheim, J. B. (2006). The one-step flow of communication. The ANNALS of the American Academy of Political and Social Science, 608(1), 213–232.
  8. Bernstein, M. S., Bakshy, E., Burke, M., & Karrer, B. (2013). Quantifying the invisible audience in social networks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 21–30). ACM. Retrieved from
  9. Boutet, A., Kim, H., & Yoneki, E. (2013). What’s in Twitter, I know what parties are popular and who you are supporting now! Social Network Analysis and Mining, 3(4), 1379–1391.
  10. Brainard, L. (2009). Cyber-communities. In H. Anheier & S. Toepler (Eds.), International Encyclopedia of Civil Society. Berlin: Springer.
  11. Cartwright, D., & Harary, F. (1956). Structural balance: A generalization of Heider’s theory. Psychological Review, 63(5), 277.
  12. Chan, C., & Fu, K. (2018). The “mutual ignoring” mechanism of cyberbalkanization: triangulating observational data analysis and agent-based modeling. Journal of Information Technology and Politics, 15(4), 378–387.
  13. Chan, C. H., & Fu, K. W. (2017). The relationship between cyberbalkanization and opinion polarization: Time-series analysis on Facebook pages and opinion polls during the Hong Kong Occupy Movement and the associated debate on political reform. Journal of Computer-Mediated Communication, 22(5), 266–283.
  14. Chan, J. (2014). Hong Kong’s umbrella movement. The Round Table, 103(6), 571–580.
  15. Chen, L., Gong, T., Kosinski, M., Stillwell, D., & Davidson, R. L. (2017). Building a profile of subjective well-being for social media users. PLoS One, 12(11), e0187278.
  16. Chen, T., & Guestrin, C. (2016). XGBoost: a scalable tree boosting system. [Cs], pp 785–794.
  17. Christensen, H. S. (2011). Political activities on the Internet: Slacktivism or political participation by other means? First Monday, 16(2). Retrieved from
  18. Conover, M. D., Gonçalves, B., Flammini, A., & Menczer, F. (2012). Partisan asymmetries in online political activity. EPJ Data Science, 1(1), 6.
  19. Conover, M., Ratkiewicz, J., Francisco, M. R., Gonçalves, B., Menczer, F., & Flammini, A. (2011). Political polarization on Twitter. ICWSM, 133, 89–96.
  20. Dahlberg, L. (2001). Computer-mediated communication and the public sphere: A critical analysis. Journal of Computer-Mediated Communication, 7(1), JCMC714.
  21. Davis, J. A., & Leinhardt, S. (1967). The structure of positive interpersonal relations in small groups. Retrieved from
  22. Dvir-Gvirsman, S. (2016). Media audience homophily: Partisan websites, audience identity and polarization processes. New Media and Society, 1461444815625945.
  23. Feller, A., Kuhnert, M., Sprenger, T. O., & Welpe, I. M. (2011). Divided they tweet: The network structure of political microbloggers and discussion topics. In Fifth International AAAI Conference on Weblogs and Social Media.
  24. Flaxman, S., Goel, S., & Rao, J. M. (2013). Ideological segregation and the effects of social media on news consumption. Available at SSRN. Retrieved from
  25. Gerodimos, R., & Justinussen, J. (2015). Obama’s 2012 Facebook campaign: Political communication in the age of the like button. Journal of Information Technology and Politics, 12(2), 113–132.
  26. Gil de Zúñiga, H., Jung, N., & Valenzuela, S. (2012). Social media use for news and individuals’ social capital, civic engagement and political participation. Journal of Computer-Mediated Communication, 17(3), 319–336.
  27. Goodreau, S. M. (2007). Advances in exponential random graph (p*) models applied to a large social network. Social Networks, 29(2), 231–248.
  28. Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 1360–1380.
  29. Halupka, M. (2014). Clicktivism: A systematic heuristic. Policy and Internet, 6(2), 115–132.
  30. Halupka, M. (2014). Clicktivism: A systematic heuristic. Policy and Internet, 6(2), 115–132.
  31. Hasebrink, U., & Popp, J. (2006). Media repertoires as a result of selective media use. A conceptual approach to the analysis of patterns of exposure. Communications, 31(3), 369–387.
  32. Hebenstreit, J. (2014). Cyberbalkanization. In Proceedings of the 2nd Conference for EDemocracy and Open Government (CeDEM) Asia.
  33. Heider, F. (1946). Attitudes and cognitive organization. The Journal of Psychology, 21(1), 107–112.
  34. Heyde, C. C. (2014). Central limit theorem. Wiley StatsRef: Statistics reference online. Hoboken: Wiley.
  35. Kaeding, M. P. (2017). The rise of “Localism” in Hong Kong. Journal of Democracy, 28(1), 157–171.
  36. Khanafiah, D., & Situngkir, H. (2004). Social balance theory: revisiting Heider’s balance theory for many agents. Retrieved from
  37. Kim, S. J., & Webster, J. G. (2012). The impact of a multichannel environment on television news viewing: A longitudinal study of news audience polarization in South Korea. International Journal of Communication, 6, 19.
  38. Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802–5805.
  39. Krackhardt, D. (1988). Predicting with networks: Nonparametric multiple regression analysis of dyadic data. Social Networks, 10(4), 359–381.
  40. Ksiazek, T. B. (2011). A network analytic approach to understanding cross-platform audience behavior. Journal of Media Economics, 24(4), 237–251.
  41. Ksiazek, T. B., Malthouse, E. C., & Webster, J. G. (2010). News-seekers and avoiders: Exploring patterns of total news consumption across media and the relationship to civic participation. Journal of Broadcasting and Electronic Media, 54(4), 551–568.
  42. Lee, F. L. (2016). Impact of social media on opinion polarization in varying times. Communication and the Public, 1(1), 56–71.
  43. Liu, Z., & Weber, I. (2014). Is twitter a public sphere for online conflicts? A cross-ideological and cross-hierarchical look. In L. M. Aiello & D. McFarland (Eds.), Social Informatics (pp. 336–347). Springer, Berlin.
  44. Lochte, R. H., & Warren, J. (1989). A channel repertoire for TVRO satellite viewers. Journal of Broadcasting and Electronic Media, 33(1), 91–95.
  45. Lubbers, M. J., & Snijders, T. A. B. (2007). A comparison of various approaches to the exponential random graph model: A reanalysis of 102 student networks in school classes. Social Networks, 29(4), 489–507.
  46. Majó-Vázquez, S., Cardenal, A. S., & González-Bailón, S. (2017). Digital news consumption and copyright intervention: evidence from Spain before and after the 2015 “Link Tax”. Journal of Computer-Mediated Communication, 22(5), 284–301.
  47. Majó-Vázquez, S., Nielsen, R. K., & González-Bailón, S. (2018). The backbone structure of audience networks: A new approach to comparing online news consumption across countries. Political Communication.
  48. Morales, A. J., Borondo, J., Losada, J. C., & Benito, R. M. (2015). Measuring political polarization: Twitter shows the two sides of Venezuela. Chaos: An Interdisciplinary Journal of Nonlinear Science, 25(3), 3114.
  49. Mukerjee, S., Majó-Vázquez, S., & González-Bailón, S. (2018). Networks of audience overlap in the consumption of digital news. Journal of Communication, 68(1), 26–50.
  50. Papacharissi, Z. (2002). The virtual sphere: The internet as a public sphere. New Media and Society, 4(1), 9–27.
  51. Pons, P., & Latapy, M. (2005). Computing communities in large networks using random walks (long version). Retrieved from
  52. Poor, N. (2005). Mechanisms of an online public sphere: The website slashdot. Journal of Computer-Mediated Communication.
  53. Rose, J., & Saebø, Ø. (2010). Designing deliberation systems. The Information Society, 26(3), 228–240.
  54. Schmid, C. S., & Desmarais, B. A. (2017). Exponential random graph models with big networks: Maximum pseudolikelihood estimation and the parametric bootstrap. Retrieved from [Stat].
  55. Schweinberger, M. (2011). Instability, sensitivity, and degeneracy of discrete exponential families. Journal of the American Statistical Association, 106(496), 1361–1370.
  56. Strandberg, K., Himmelroos, S., & Grönlund, K. (2019). Do discussions in like-minded groups necessarily lead to more extreme opinions? Deliberative democracy and group polarization. International Political Science Review, 40(1), 41–57.
  57. Suhay, E., Blackwell, A., Roche, C., & Bruggeman, L. (2015). Forging bonds and burning bridges: Polarization and incivility in blog discussions about Occupy Wall Street. American Politics Research, 43(4), 643–679.
  58. Sunstein, C. R. (2000). Deliberative trouble? Why groups go to extremes. The Yale Law Journal, 110(1), 71–119.
  59. Sunstein, C. R. (2007). Ideological amplification. Constellations, 14(2), 273–279.
  60. Sunstein, C. R. (2009). Republic.com 2.0. Princeton: Princeton University Press.
  61. Sunstein, C. R. (2017). #Republic: divided democracy in the age of social media. Princeton: Princeton University Press.
  62. Taneja, H., Webster, J. G., Malthouse, E. C., & Ksiazek, T. B. (2012). Media consumption across platforms: Identifying user-defined repertoires. New Media and Society, 14(6), 951–968.
  63. Towne, W. B., & Herbsleb, J. D. (2012). Design considerations for online deliberation systems. Journal of Information Technology and Politics, 9(1), 97–115.
  64. Van Alstyne, M., & Brynjolfsson, E. (1996). Electronic communities: Global village or cyberbalkans. In Proceedings of the 17th International Conference on Information Systems. New York: Wiley. Retrieved from
  65. van Rees, K., & van Eijck, K. (2003). Media repertoires of selective audiences: the impact of status, gender, and age on media use. Poetics, 31(5), 465–490.
  66. Wang, Z., & Thorngate, W. (2003). Sentiment and social mitosis: Implications of Heider’s Balance Theory. Journal of Artificial Societies and Social Simulation, 6(3). Retrieved from
  67. Wasserman, S. S. (1977). Random directed graph distributions and the triad census in social networks. The Journal of Mathematical Sociology, 5(1), 61–86.
  68. Webster, J. G., & Ksiazek, T. B. (2012). The dynamics of audience fragmentation: Public attention in an age of digital media. Journal of Communication, 62(1), 39–56.
  69. Webster, J., & Phalen, P. F. (1996). The mass audience: Rediscovering the dominant model. Mahwah, NJ: Routledge.
  70. Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112(4), 1036–1040.
  71. Yuan, E. J., & Webster, J. G. (2006). Channel repertoires: Using peoplemeter data in Beijing. Journal of Broadcasting and Electronic Media, 50(3), 524–536.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Mannheimer Zentrum für Europäische Sozialforschung, Universität Mannheim, Mannheim, Germany
  2. Journalism and Media Studies Centre, University of Hong Kong, Hong Kong, China
