Measuring and moderating opinion polarization in social networks

Data Mining and Knowledge Discovery

Abstract

The polarization of society over controversial social issues has been the subject of study in social sciences for decades (Isenberg in J Personal Soc Psychol 50(6):1141–1151, 1986, Sunstein in J Polit Philos 10(2):175–195, 2002). The widespread usage of online social networks and social media, and the tendency of people to connect and interact with like-minded individuals, have only intensified the phenomenon of polarization (Bakshy et al. in Science 348(6239):1130–1132, 2015). In this paper, we consider the problem of measuring and reducing polarization of opinions in a social network. Using a standard opinion formation model (Friedkin and Johnsen in J Math Soc 15(3–4):193–206, 1990), we define the polarization index, which, given a network and the opinions of the individuals in the network, quantifies the polarization observed in the network. Our measure captures the tendency of opinions to concentrate in network communities, creating echo-chambers. Given this numeric measure of polarization, we then consider the problem of reducing polarization in the network by convincing individuals (e.g., through education, exposure to diverse viewpoints, or incentives) to adopt a more neutral stance towards controversial issues. We formally define the ModerateInternal and ModerateExpressed problems, and we prove that both problems are NP-hard. By exploiting the linear-algebraic characteristics of the opinion formation model, we design polynomial-time algorithms for both problems. Our experiments with real-world datasets demonstrate the validity of our metric, and the efficiency and effectiveness of our algorithms in practice.

Notes

  1. https://en.wikipedia.org/wiki/The_dress.

  2. http://www.people-press.org/2014/06/12/political-polarization-in-the-american-public/.

  3. https://networkdata.ics.uci.edu/data.php?id=105.

  4. https://networkdata.ics.uci.edu/data.php?id=8.

  5. https://networkdata.ics.uci.edu/data.php?id=102.

References

  • Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. election: divided they blog. In: International workshop on link discovery, LinkKDD

  • Akoglu L (2014) Quantifying political polarity based on bipartite opinion networks. In: International conference on weblogs and social media, ICWSM

  • Amelkin V, Singh AK, Bogdanov P (2015) A distance measure for the analysis of polar opinion dynamics in social networks. arXiv:1510.05058

  • Bakshy E, Messing S, Adamic L (2015) Exposure to ideologically diverse news and opinion on Facebook. Science 348(6239):1130–1132

  • Bessi A, Zollo F, Vicario MD, Puliga M, Scala A, Caldarelli G, Uzzi B, Quattrociocchi W (2016) Users polarization on Facebook and Youtube. PLoS ONE 11(8):e0159641

  • Bindel D, Kleinberg JM, Oren S (2015) How bad is forming your own opinion? Games Econ Behav 92:248–265

  • Cambria E, Poria S, Bisio F, Bajpai R, Chaturvedi I (2015) The CLSA model: a novel framework for concept-level sentiment analysis. Springer International Publishing, Cham. doi:10.1007/978-3-319-18117-2_1

  • Cambria E, Poria S, Bajpai R, Schuller BW (2016) SenticNet 4: A semantic resource for sentiment analysis based on conceptual primitives. In: 26th International conference on computational linguistics (COLING 2016), Proceedings of the conference: Technical Papers, Osaka, Japan, December 11–16, 2016, pp. 2666–2677

  • Chen T, Xu R, He Y, Xia Y, Wang X (2016) Learning user and product distributed representations using a sequence model for sentiment analysis. IEEE Comp Int Mag 11(3):34–44. doi:10.1109/MCI.2016.2572539

  • Conover M, Ratkiewicz J, Francisco MR, Gonçalves B, Menczer F, Flammini A (2011) Political polarization on Twitter. In: International conference on weblogs and social media ICWSM

  • Dandekar P, Goel A, Lee DT (2013) Biased assimilation, homophily, and the dynamics of polarization. Proc Natl Acad Sci 110(15):5791–5796

  • Davis G, Mallat S, Zhang Z (1994) Adaptive time-frequency decompositions with matching pursuits. Opt Eng 33(7):2183–2191

  • Del Vicario M, Scala A, Caldarelli G, Stanley HE, Quattrociocchi W (2017) Modeling confirmation bias and polarization. Sci Rep 7:40391. doi:10.1038/srep40391

  • Feige U (2003) Vertex cover is hardest to approximate on regular graphs. Technical report MCS03-15 of the Weizmann Institute

  • Friedkin NE, Johnsen E (1990) Social influence and opinions. J Math Soc 15(3–4):193–206

  • Garimella K, Morales GDF, Gionis A, Mathioudakis M (2016) Quantifying controversy in social media. In: ACM international conference on web search and data mining, WSDM, pp 33–42

  • Garimella VRK, Morales GDF, Gionis A, Mathioudakis M (2017) Reducing controversy by connecting opposing views. In: ACM international conference on web search and data mining, WSDM

  • Garrett RK (2009) Echo chambers online? Politically motivated selective exposure among internet news users. J Comput Mediat Commun 14(2):265–285. doi:10.1111/j.1083-6101.2009.01440.x

  • Gionis A, Terzi E, Tsaparas P (2013) Opinion maximization in social networks. In: SIAM international conference on data mining, pp 387–395

  • Guerra PHC, Meira W Jr, Cardie C, Kleinberg R (2013) A measure of polarization on social media networks based on community boundaries. In: International conference on weblogs and social media, ICWSM

  • Hager WW (1989) Updating the inverse of a matrix. SIAM Rev 31(2):221–239

  • Isenberg DJ (1986) Group polarization: a critical review and meta-analysis. J Personal Soc Psychol 50(6):1141–1151

  • Kempe D, Kleinberg J, Tardos E (2003) Maximizing the spread of influence through a social network. In: ACM SIGKDD international conference on knowledge discovery and data mining, pp 137–146

  • Lappas T, Crovella M, Terzi E (2012) Selecting a characteristic set of reviews. In: ACM SIGKDD international conference on knowledge discovery and data mining, pp 832–840

  • Page L, Brin S, Motwani R, Winograd T (1998) The PageRank citation ranking: bringing order to the web. Technical report, Stanford University

  • Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167

  • Mallat S (2008) A wavelet tour of signal processing, third edition: the sparse way, 3rd edn. Academic Press, Cambridge

  • Munson SA, Lee SY, Resnick P (2013) Encouraging reading of diverse political viewpoints with a browser widget. In: International conference on weblogs and social media, ICWSM

  • Munson SA, Resnick P (2010) Presenting diverse political opinions: how and how much. In: International conference on human factors in computing systems, CHI, pp 1457–1466

  • Natarajan BK (1995) Sparse approximate solutions to linear systems. SIAM J Comput 24(2):227–234

  • Pariser E (2011) The filter bubble: what the internet is hiding from you. The Penguin Group

  • Poria S, Cambria E, Gelbukh A (2016) Aspect extraction for opinion mining with a deep convolutional neural network. Knowl Based Syst 108(C):42–49. doi:10.1016/j.knosys.2016.06.009

  • Sunstein CR (2002) The law of group polarization. J Polit Philos 10(2):175–195

  • Vicario MD, Scala A, Caldarelli G, Stanley HE, Quattrociocchi W (2016) Modeling confirmation bias and polarization. arXiv:1607.00022

  • Vydiswaran V, Zhai C, Roth D, Pirolli P (2015) Overcoming bias to learn about controversial topics. J Assoc Inf Sci Technol 66(8):1655–1672

Acknowledgements

This work was supported by the Marie Curie Reintegration Grant project titled JMUGCS, which has received research funding from the European Union, and by the National Science Foundation grants IIS 1320542, IIS 1421759, and CAREER 1253393, as well as a gift from Microsoft. We would also like to thank Evaggelia Pitoura for useful comments and discussions on early drafts of the paper.

Author information

Corresponding author

Correspondence to Antonis Matakos.

Additional information

Responsible editors: Kurt Driessens, Dragi Kocev, Marko Robnik-Šikonja, Myra Spiliopoulou.

Appendix A: Theorem 1 proof

Theorem 1 The ModerateInternal problem is NP-hard.

Proof

Our proof uses a reduction from the m-SubsetSum problem, where given a set of N positive integer numbers \(v_1,\ldots ,v_N\), a value m, and a target value b, we ask if there is a set of numbers B of size m, such that \(\sum _{v_i \in B} v_i = b\).

Given an instance of the m-SubsetSum problem, we construct an instance of ModerateInternal as follows. The graph is a star with \(N+1\) nodes: we have a central node \(u_0\), and a spoke node \(u_i\) for each integer \(v_i\). For the center of the star (node \(u_0\)) we have that \(w_{00} = t\), for an appropriately selected value of t (we will discuss this below), and \(s_0 = -1\). The weight of the edge \((u_0,u_i)\) from the center to node \(u_i\) is \(w_{0i} = v_i\), and the weight of node \(u_i\) to its internal opinion is also \(w_{ii} = v_i\). The internal opinion of all spoke nodes is \(s_i = 1\). We set \(k = N-m\), and we ask for a set of nodes \(T_s\), \(|T_s| = k\), such that, when setting \(s_i = 0\) for all \(u_i \in T_s\), the polarization \(\pi (\mathbf {z}\mid T_s) = \Vert \mathbf {z}\Vert ^2\) is minimized.

The intuition of the proof is that the expressed opinion of the center node \(z_0\) determines \(\pi (\mathbf {z})\). The value of \(z_0\) is determined by the weight t of the internal opinion of \(u_0\), and the weights of the edges to the nodes whose opinion is not set to zero. If we select t appropriately, we can guarantee that \(\Vert \mathbf {z}\Vert ^2\) attains its minimum exactly when the integers \(v_i\) of the nodes whose opinion is not set to zero sum to the value b.

Formally, assume that we have selected the set \(T_s\), \(|T_s| = k\), and that \(u_0 \not \in T_s\). Let \(R = V {\setminus } (T_s\cup \{u_0\})\) denote the set of spoke nodes whose opinion was not set to 0. According to the opinion formation model, each spoke node weighs its own internal opinion and the expressed opinion of the center equally (both weights are \(v_i\)), so the expressed opinions of the spoke nodes are as follows. For every node \(u_i \in R\), \( z_i = \frac{z_0}{2} + \frac{1}{2}, \) while for every node \(u_i \in T_s\), \( z_i = \frac{z_0}{2}. \)

We can thus write:

$$\begin{aligned} \pi (\mathbf {z}\mid T_s)= & {} \Vert \mathbf {z}\Vert ^2 = z_0^2 + k\frac{1}{4} z_0^2 + (N-k)\frac{1}{4}(z_0^2 + 2z_0 + 1)\\= & {} \frac{N+4}{4}z_0^2 + \frac{N-k}{2}z_0 + \frac{N-k}{4}. \end{aligned}$$

Recall that we want to minimize \(\pi (\mathbf {z}\mid T_s)\). To find the value of \(z_0\) that minimizes \(\pi (\mathbf {z}\mid T_s)\), we take the derivative of the expression above with respect to \(z_0\), set it to zero, and solve for \(z_0\). We get that the value of \(z_0\) that minimizes \(\pi (\mathbf {z}\mid T_s)\) is:

$$\begin{aligned} z_0^*= \frac{k-N}{N+4} . \end{aligned}$$

It follows that the minimum value of \(\pi (\mathbf {z}\mid T_s)\) is

$$\begin{aligned} \pi ^*= \frac{(N-k)(k+4)}{4(N+4)}. \end{aligned}$$
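For completeness, the substitution behind this value can be written out. The following only evaluates the quadratic in \(z_0\) given above at \(z_0^* = \frac{k-N}{N+4}\) and introduces no new assumptions:

$$\begin{aligned} \pi ^*= & {} \frac{N+4}{4}\cdot \frac{(N-k)^2}{(N+4)^2} - \frac{N-k}{2}\cdot \frac{N-k}{N+4} + \frac{N-k}{4}\\= & {} \frac{(N-k)^2 - 2(N-k)^2 + (N-k)(N+4)}{4(N+4)} = \frac{(N-k)\left( (N+4)-(N-k)\right) }{4(N+4)} = \frac{(N-k)(k+4)}{4(N+4)}. \end{aligned}$$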

We now set the value of t such that if the integers \(v_i\) of the nodes in R sum to b, then \(z_0\) achieves the value \(z_0^*\). First we compute the value of \(z_0\) as a function of t. In the following we set \(W = \sum _{i=1}^N v_i\). We have that:

$$\begin{aligned} z_0= & {} \sum _{i = 1}^N \frac{v_iz_i}{W+t} - \frac{t}{W+t} = \sum _{u_i \in T_s} \frac{v_i z_0}{2(W+t)} + \sum _{u_i \in R} \frac{v_i(z_0 + 1)}{2(W+t)} - \frac{t}{W+t} \\= & {} \frac{\sum _{i = 1}^N v_i}{2(W+t)}z_0 + \frac{\sum _{u_i \in R} v_i}{2(W+t)} - \frac{t}{W+t} = \frac{W}{2(W+t)}z_0 + \frac{\sum _{u_i \in R} v_i - 2t}{2(W+t)} \end{aligned}$$

Solving for \(z_0\) we get:

$$\begin{aligned} z_0 = \frac{\sum _{u_i \in R} v_i - 2t}{W+2t}. \end{aligned}$$
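As a sanity check on this closed form, the sketch below solves the equilibrium equations of the star instance exactly and compares the result with the expression for \(z_0\). It is a minimal illustration, not the authors' code: the helper names `solve_exact` and `star_equilibrium`, the sample integers, and the choice \(t = 3\) are all assumptions made for the example.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact Fractions."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def star_equilibrium(v, t, moderated):
    """Expressed opinions z (and internal opinions s) on the star instance.

    v: spoke integers v_1..v_N; t: weight w_00 of the center's internal
    opinion; moderated: 1-based spoke indices whose internal opinion is 0.
    """
    N, W, t = len(v), sum(v), Fraction(t)
    s = [Fraction(-1)] + [Fraction(0 if i in moderated else 1) for i in range(1, N + 1)]
    A = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    rhs = [Fraction(0)] * (N + 1)
    # Center: (W + t) z_0 - sum_i v_i z_i = t * s_0
    A[0][0] = W + t
    rhs[0] = t * s[0]
    for i in range(1, N + 1):
        A[0][i] = Fraction(-v[i - 1])
        # Spoke i: 2 v_i z_i - v_i z_0 = v_i s_i
        A[i][i] = Fraction(2 * v[i - 1])
        A[i][0] = Fraction(-v[i - 1])
        rhs[i] = v[i - 1] * s[i]
    return solve_exact(A, rhs), s


# Hypothetical instance: spokes 2 and 4 are moderated, so R = {u_1, u_3}.
v, t, moderated = [3, 5, 2, 7], 3, {2, 4}
z, s = star_equilibrium(v, t, moderated)
sum_R = sum(x for i, x in enumerate(v, start=1) if i not in moderated)
closed_form = Fraction(sum_R - 2 * t, sum(v) + 2 * t)
print(z[0], closed_form)
```

Exact rational arithmetic is used so that the comparison with the closed form is an equality rather than a floating-point approximation.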

We want the minimum to be achieved when \(\sum _{u_i \in R} v_i = b\). Setting \(z_0 = z_0^*\) we get:

$$\begin{aligned} \frac{b - 2t}{W+2t} = \frac{k-N}{N+4} \end{aligned}$$

Solving for t we get:

$$\begin{aligned} t = \frac{(N+4)b + (N-k)W}{2(k+4)}. \end{aligned}$$
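The choice of t can be checked numerically. The sketch below is an illustration under assumed inputs (the function name `center_weight` and the sample instance are hypothetical): it computes t as an exact rational and confirms that, when the unmoderated spokes sum to b, the center's expressed opinion equals \(z_0^*\).

```python
from fractions import Fraction


def center_weight(N, k, W, b):
    """Weight t = w_00 used in the reduction, as an exact rational."""
    return Fraction((N + 4) * b + (N - k) * W, 2 * (k + 4))


# Hypothetical small instance: N = 4 integers, subset size m = 2, target b = 5.
v = [3, 5, 2, 7]
N, m, b = len(v), 2, 5
k, W = N - m, sum(v)
t = center_weight(N, k, W, b)

# Minimizer of the quadratic in z_0.
z0_star = Fraction(k - N, N + 4)
# Value of z_0 when the integers of the nodes in R sum to b.
z0_at_b = (b - 2 * t) / (W + 2 * t)

print(t, z0_star, z0_at_b)
```

The same substitution also gives \(W + 2t = \frac{(N+4)(W+b)}{k+4}\), an identity that is convenient when bounding the distance between \(z_0\) and \(z_0^*\).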

Now, we want to prove the following. There is a set B of m numbers such that \(\sum _{v_i \in B} v_i = b\), if and only if there is a set of nodes \(T_s\) of size \(k = N-m\) such that when setting their internal opinion to zero, \(\pi (\mathbf {z}\mid T_s) < \pi ^*+\epsilon \) for some appropriate value of \(\epsilon \).

The forward direction is easy. If such a set B exists, let \(T_s\) consist of the \(k = N-m\) spoke nodes whose integers are not in B, so that the set R corresponds to B. When setting the opinions of the nodes in \(T_s\) to zero, we have that

$$\begin{aligned} z_0 = \frac{\sum _{u_i \in R} v_i - 2t}{W+2t} = \frac{b - 2t}{W+2t} = \frac{k-N}{N+4} , \end{aligned}$$

and therefore \(\pi (\mathbf {z}\mid T_s) = \pi ^*\).

For the backward direction, if no such set of numbers exists, then for every choice of \(T_s\) we have \(\sum _{u_i \in R} v_i \ne b\), so it is not possible to find a set of nodes \(T_s\) such that the nodes in R give the value \(z_0 = \frac{k-N}{N+4}\) that minimizes \(\pi (\mathbf {z}\mid T_s)\). Therefore, there must be an \(\epsilon > 0\) such that \(\pi (\mathbf {z}\mid T_s) \ge \pi ^*+ \epsilon \) for every \(T_s\).

To set \(\epsilon \) note that for any \(z_0 \ne z_0^*\)

$$\begin{aligned} \left| z_0 - z_0^*\right| = \left| \frac{\sum _{u_i \in R} v_i - b}{W+2t} \right| \ge \frac{1}{W+2t} = \frac{k+4}{(N+4)(W+b)} , \end{aligned}$$

where the inequality follows from the fact that the values \(v_1,\ldots ,v_N, b\) are integers, so a nonzero difference \(\sum _{u_i \in R} v_i - b\) has absolute value at least one, and the last equality follows by substituting the value of t. Now, let \(\mathbf {z}^*\) be the vector with \(z_0^*\) that achieves the minimum value \(\pi ^*\). For any other \(\mathbf {z}\) we have

$$\begin{aligned} \pi (\mathbf {z}) - \pi ^*= & {} \frac{N+4}{4}\left( z_0^2- (z_0^*)^2\right) + \frac{N-k}{2}(z_0 - z_0^*)\\= & {} \left( z_0-z_0^*\right) \left( \frac{N+4}{4}z_0 + \frac{N+4}{4}z_0^*-\frac{2(N+4)}{4} \frac{k-N}{N+4}\right) \\= & {} \left( z_0-z_0^*\right) \left( \frac{N+4}{4}z_0 - \frac{N+4}{4}z_0^*\right) = \frac{N+4}{4} \left( z_0-z_0^*\right) ^2\\\ge & {} \frac{N+4}{4}\left( \frac{1}{W+2t}\right) ^2 = \frac{(k+4)^2}{4(N+4)(W+b)^2}. \end{aligned}$$

So it suffices to set \(\epsilon < \frac{(k+4)^2}{4(N+4)(W+b)^2}\).
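The equivalence and the gap can be illustrated by brute force on a tiny instance. The sketch below is an assumption-laden illustration rather than part of the proof: the function names and sample integers are hypothetical, only sets \(T_s\) of spoke nodes are enumerated (so \(u_0 \not \in T_s\)), and the polarization is evaluated through the closed forms for \(z_0\), t, and \(\pi \) stated in the proof.

```python
from fractions import Fraction
from itertools import combinations


def polarization(v, k, b, moderated):
    """Return (pi(z | T_s), sum of v_i over R) for the star instance.

    moderated holds 0-based indices of the spokes in T_s.
    """
    N, W = len(v), sum(v)
    t = Fraction((N + 4) * b + (N - k) * W, 2 * (k + 4))
    sum_R = sum(x for i, x in enumerate(v) if i not in moderated)
    z0 = (sum_R - 2 * t) / (W + 2 * t)
    pi = Fraction(N + 4, 4) * z0**2 + Fraction(N - k, 2) * z0 + Fraction(N - k, 4)
    return pi, sum_R


def check_gap(v, m, b):
    """Enumerate all spoke sets T_s of size k = N - m and check both directions.

    Returns the sets T_s that reach pi*, i.e. whose complement R sums to b.
    """
    N, W = len(v), sum(v)
    k = N - m
    pi_star = Fraction((N - k) * (k + 4), 4 * (N + 4))
    gap = Fraction((k + 4) ** 2, 4 * (N + 4) * (W + b) ** 2)
    hits = []
    for T in combinations(range(N), k):
        pi, sum_R = polarization(v, k, b, set(T))
        if sum_R == b:
            assert pi == pi_star          # forward direction
            hits.append(T)
        else:
            assert pi - pi_star >= gap    # backward direction, with the gap
    return hits


v = [3, 5, 2, 7]
print(check_gap(v, m=2, b=5))  # the pair 3 + 2 sums to 5
print(check_gap(v, m=2, b=6))  # no pair sums to 6
```

Since the number of candidate sets grows combinatorially, this enumeration is only meant for very small instances.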

Finally, in our computations so far we have assumed that our set \(T_s\) does not contain node \(u_0\). This is not a restrictive assumption. Consider a solution \(T_s\), where \(u_0 \in T_s\), and \(s_0 = 0\). Then, since \(s_0\) is the only negative opinion value in our instance, it follows that \(z_0 \ge 0\), and for any node \(u_i\in R\) we have that \( z_i = \frac{1}{2} z_0 + \frac{1}{2} \ge \frac{1}{2} \), and hence \(z_i^2 \ge \frac{1}{4}\). There are \(N+1-k\) nodes in R. Therefore,

$$\begin{aligned} \pi (\mathbf {z}\mid T_s) \ge \frac{N+1-k}{4}. \end{aligned}$$

Note that \(\pi ^*= \frac{(N-k)(k+4)}{4(N+4)} \le \frac{N-k}{4}\), since \(k \le N\). Therefore, \(\pi (\mathbf {z}\mid T_s) \ge \pi ^*+ \frac{1}{4}\). Selecting \(\epsilon < \frac{1}{4}\), in addition to the bound above, guarantees that \(\pi (\mathbf {z}\mid T_s) > \pi ^*+ \epsilon \). Thus, a set \(T_s\) with \(\pi (\mathbf {z}\mid T_s) < \pi ^*+ \epsilon \) cannot contain \(u_0\). \(\square \)

About this article

Cite this article

Matakos, A., Terzi, E. & Tsaparas, P. Measuring and moderating opinion polarization in social networks. Data Min Knowl Disc 31, 1480–1505 (2017). https://doi.org/10.1007/s10618-017-0527-9

  • DOI: https://doi.org/10.1007/s10618-017-0527-9
