Abstract
Civil unrest events (protests, strikes, and “occupy” events) range from small, nonviolent protests that address specific issues to events that turn into large-scale riots. Detecting and forecasting these events is of key interest to social scientists and policy makers because they can lead to significant societal and cultural changes. We forecast civil unrest events in six countries in Latin America on a daily basis, from November 2012 through August 2014, using multiple data sources that capture the social, political, and economic contexts within which civil unrest occurs. The models contain predictors extracted from social media sites (Twitter and blogs) and news sources, in addition to the volume of requests to Tor, a widely used anonymity network. Two political event databases and country-specific exchange rates are also used. Our forecasting models are evaluated against a Gold Standard Report (GSR), which is compiled by an independent group of social scientists and subject matter experts. We use logistic regression models with Lasso to select a sparse feature set from our diverse datasets. The experimental results, measured by F1-scores, range from 0.68 to 0.95 and demonstrate the efficacy of a multi-source approach for predicting civil unrest. Case studies illustrate the insights into unrest events that our method provides. An ablation study demonstrates the relative value of the data sources for prediction. We find that social media and news are more informative than the other data sources, including the political event databases, and that they improve prediction performance. However, social media also increases the variation in the performance metrics.
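The modeling idea summarized in the abstract, logistic regression with a Lasso (L1) penalty scored by F1, can be illustrated with a minimal sketch. This is not the authors' implementation: the proximal-gradient solver, the function names, the hyperparameters (`lam`, `lr`, `n_iter`), and the toy data are all assumptions chosen for illustration, and the real features come from the data sources described above.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.1, n_iter=500):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    lam, lr, and n_iter are illustrative defaults, not values from the paper.
    The intercept b is not penalized.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        # Gradient step on the log-loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict(X, w, b):
    """Label a day 1 (unrest) when the predicted probability is at least 0.5."""
    return [1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0
            for xi in X]


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2.0 * tp / (2.0 * tp + fp + fn)


if __name__ == "__main__":
    # Hypothetical toy data: rows are days, columns are standardized predictors.
    # The second column is constant zero, so it carries no signal.
    X_toy = [[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
    y_toy = [1, 0, 1, 0]
    w_hat, b_hat = fit_l1_logistic(X_toy, y_toy)
    print("weights:", w_hat, "intercept:", b_hat)
    print("F1:", f1_score(y_toy, predict(X_toy, w_hat, b_hat)))
```

The shrinkage step is what produces sparsity: any coefficient whose gradient update stays within the threshold is set exactly to zero, which is how a Lasso penalty drops uninformative predictors from a large, heterogeneous feature set.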
Notes
June 21, 2013, “Protesters, criminals get around government censors using secret web network,” http://bit.ly/1Sghvo7.
This model was briefly mentioned, along with several others, in Ramakrishnan et al. (2014) as part of an automated, real-time forecasting software system. This paper describes our model and results in detail.
The dictionary is compiled by a different group of experts from the one that generated the GSR.
Source: https://www.torproject.org/.
References
Arias M, Arratia A, Xuriguera R (2013) Forecasting with Twitter data. ACM Trans Intell Syst Technol (TIST) 5(1):8:1–8:24
Asur S, Huberman BA (2010) Predicting the future with social media. In: Proceedings of the 2010 IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology (WI-IAT), vol 1, pp 492–499
Bellemare MF (2015) Rising food prices, food price volatility, and social unrest. Am J Agric Econ 97(1):1–21
Bollen J, Mao H, Zeng X (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1–8
Chakraborty P, Khadivi P, Lewis B, Mahendiran A, Chen J, Butler P, Nsoesie EO, Mekaru SR, Brownstein JS, Marathe M, et al (2014) Forecasting a moving target: ensemble models for ILI case count predictions. In: Proceedings of the 2014 SIAM international conference on data mining, pp 262–270
Chen F, Neill DB (2014) Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1166–1175
Culotta A (2010) Towards detecting influenza epidemics by analyzing Twitter messages. In: Proceedings of the first workshop on social media analytics, pp 115–122
El-Katiri L, Fattouh B, Mallinson R (2014) The Arab uprisings and MENA political instability: implications for oil & gas markets. OIES Paper: MEP 8, Oxford Institute for Energy Studies, Oxford. http://www.oxfordenergy.org/2014/03/the-arab-uprisings-and-mena-political-instability-implications-foroil-gas-markets/
Gerner DJ, Schrodt PA, Francisco RA, Weddle JL (1994) Machine coding of event data using regional and international sources. Int Stud Q 38(1):91–119
Gerner DJ, Schrodt PA, Yilmaz O, Abu-Jabr R (2002) Conflict and mediation event observations (CAMEO): a new event data framework for the analysis of foreign policy interactions. In: 43rd Annual convention of the international studies association, pp 24–27
Golub GH, Reinsch C (1970) Singular value decomposition and least squares solutions. Numer Math 14(5):403–420
González-Bailón S, Borge-Holthoefer J, Rivero A, Moreno Y (2011) The dynamics of protest recruitment through an online network. Sci Rep 1(197). doi:10.1038/srep00197
Kallus N (2014) Predicting crowd behavior with big public data. In: Proceedings of the Companion publication of the 23rd international conference on world wide web companion, pp 625–630
Keneshloo Y, Cadena J, Korkmaz G, Ramakrishnan N (2014) Detecting and forecasting domestic political crises: a graph-based approach. In: Proceedings of the 2014 ACM conference on web science, pp 192–196
Korkmaz G, Cadena J, Kuhlman CJ, Marathe A, Vullikanti A, Ramakrishnan N (2015) Combining heterogeneous data sources for civil unrest forecasting. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining. ACM, pp 258–265
Lampos V, De Bie T, Cristianini N (2010) Flu detector: tracking epidemics on Twitter. In: Balcázar JL, Bonchi F, Gionis A, Sebag M (eds) Proceedings of the 2010 European conference on machine learning and knowledge discovery in databases: Part III (ECML PKDD'10). Springer, Berlin, Heidelberg, pp 599–602
Leetaru K, Schrodt PA (2013) GDELT: Global data on events, location, and tone, 1979–2012. In: International Studies Association (ISA) Annual Convention, vol 2. Citeseer
Lynch J (1973) The Spanish-American revolutions, 1808–1826. Norton, New York
McFadden D (1973) Conditional logit analysis of qualitative choice behavior. Front Econ pp 105–142
McFadden D (1977) Quantitative methods for analyzing travel behaviour of individuals: some recent developments. Technical report, Cowles Foundation for Research in Economics, Yale University
Muthiah S, Huang B, Arredondo J, Mares D, Getoor L, Katz G, Ramakrishnan N (2015) Planned protest modeling in news and social media. In: Proceedings of the Twenty-Seventh annual conference on innovative applications of artificial intelligence (IAAI), pp 3920–3927
Piven FF, Cloward RA (1977) Poor people’s movements. Pantheon, New York
Ramakrishnan N, Butler P, Muthiah S, Self N, Khandpur R, Saraf P, Wang W, Cadena J, Vullikanti A, Korkmaz G, Kuhlman C, Marathe A, Zhao L, Hua T, Chen F, Lu CT, Huang B, Srinivasan A, Trinh K, Getoor L, Katz G, Doyle A, Ackermann C, Zavorin I, Ford J, Summers K, Fayed Y, Arredondo J, Gupta D, Mares D (2014) “Beating the news” with EMBERS: forecasting civil unrest using open source indicators. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1799–1808
Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. In: Proceedings of the 19th international conference on world wide web. ACM, pp 851–860
Shi L, Agarwal N, Agrawal A, Garg R, Spoelstra J (2012) Predicting US primary elections with Twitter. http://snap.stanford.edu/social2012/papers/shi
Starbird K, Palen L (2012) (How) will the revolution be retweeted? Information diffusion and the 2011 Egyptian uprising. In: Proceedings of the 2012 ACM conference on computer supported cooperative work, pp 7–16
Stoll RJ, Subramanian D (2006) Hubs, authorities, and networks: predicting conflict using events data. In: International Studies Association (ISA) Annual Convention, Citeseer
Székely GJ, Rizzo ML, Bakirov NK et al (2007) Measuring and testing dependence by correlation of distances. Ann Stat 35(6):2769–2794
Tang J, Wang X, Liu H (2012) Integrating social media data for community detection. In: Modeling and mining ubiquitous social media. Springer, Berlin, pp 1–20
Theocharis Y (2013) The wealth of (occupation) networks? Communication patterns and information distribution in a Twitter protest network. J Inf Technol Politics 10(1):35–56
Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B (Methodological) 58(1):267–288
Tumasjan A, Sprenger TO, Sandner PG, Welpe IM (2010) Predicting elections with Twitter: what 140 characters reveal about political sentiment. In: Proceedings of the fourth international AAAI conference on weblogs and social media (ICWSM), vol 10, pp 178–185
Ward MD, Metternich NW, Carrington C, Dorff C, Gallop M, Hollenbach FM, Schultz A, Weschle S (2012) Geographical models of crises: evidence from ICEWS. Adv Des Cross-Cult Activities 429–438
Wulf V, Aal K, Abu Kteish I, Atam M, Schubert K, Rohde M, Yerousis GP, Randall D (2013a) Fighting against the wall: social media use by political activists in a Palestinian village. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, pp 1979–1988
Wulf V, Misaki K, Atam M, Randall D, Rohde M (2013b) On the ground in Sidi Bouzid: Investigating social media use during the Tunisian revolution. In: Proceedings of the 2013 ACM conference on computer supported cooperative work, pp 1409–1418
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B (Stat Methodol) 68(1):49–67
Acknowledgments
This work has been partially supported by the following grants: DTRA Grant HDTRA1-11-1-0016, DTRA CNIMS Contract HDTRA1-11-D-0016-0010, NSF ICES CCF-1216000, NSF NETSE Grant CNS-1011769 and NIH 1R01GM109718. It was also supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center (DoI/NBC) Contract No. D12PC000337. The US Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
Additional information
A preliminary version of the paper appeared in the Proceedings of 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (Korkmaz et al. 2015).
Cite this article
Korkmaz, G., Cadena, J., Kuhlman, C.J. et al. Multi-source models for civil unrest forecasting. Soc. Netw. Anal. Min. 6, 50 (2016). https://doi.org/10.1007/s13278-016-0355-8