Abstract
We propose a system that assigns topical labels to events detected automatically in the Twitter stream. Detecting and labeling events in social media streams automatically is challenging because of the large number and variety of messages posted. Early detection of future social events, specifically those associated with civil unrest, has wide applicability in areas such as security, e-governance, and journalism. We applied machine learning algorithms and encoded the social media data with a wide range of features. Experiments show that the first step achieves high precision but low recall. We therefore designed a second step that exploits classification probabilities, boosting recall for our category of interest, social action events.
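The abstract describes the second step only at a high level. As a minimal sketch of one way classification probabilities could be used to raise recall for a target category, the Python below relabels an event as the target whenever the target's probability reaches a threshold, even if another category scored higher. The thresholding rule, the threshold value, the category names other than social action, and all probabilities are illustrative assumptions, not the procedure or data from the paper.

```python
# Illustrative sketch only: the thresholding rule and all numbers are assumptions.

TARGET = "social_action"

# Hypothetical per-event class probabilities, e.g. as output by a probabilistic classifier.
probabilities = [
    {"social_action": 0.70, "sport": 0.20, "other": 0.10},
    {"social_action": 0.30, "sport": 0.25, "other": 0.45},
    {"social_action": 0.10, "sport": 0.80, "other": 0.10},
    {"social_action": 0.32, "sport": 0.15, "other": 0.53},
    {"social_action": 0.35, "sport": 0.40, "other": 0.25},
]
# Hypothetical gold labels for the five events above.
gold = ["social_action", "social_action", "sport", "other", "social_action"]


def first_step(dists):
    """Label each event with its most probable category."""
    return [max(d, key=d.get) for d in dists]


def second_step(dists, target, threshold):
    """Label an event as `target` if its probability reaches `threshold`;
    otherwise fall back to the most probable category."""
    labels = []
    for d in dists:
        if d.get(target, 0.0) >= threshold:
            labels.append(target)
        else:
            labels.append(max(d, key=d.get))
    return labels


def precision_recall(gold_labels, predicted, target):
    """Precision and recall for a single category."""
    tp = sum(1 for g, p in zip(gold_labels, predicted) if g == target and p == target)
    predicted_target = sum(1 for p in predicted if p == target)
    gold_target = sum(1 for g in gold_labels if g == target)
    precision = tp / predicted_target if predicted_target else 0.0
    recall = tp / gold_target if gold_target else 0.0
    return precision, recall


step1 = first_step(probabilities)
step2 = second_step(probabilities, TARGET, threshold=0.3)
p1, r1 = precision_recall(gold, step1, TARGET)
p2, r2 = precision_recall(gold, step2, TARGET)
print(f"first step:  precision={p1:.2f} recall={r1:.2f}")
print(f"second step: precision={p2:.2f} recall={r2:.2f}")
```

On this toy data, lowering the bar for the target category recovers events that the most-probable-label rule missed, at the cost of one false positive. A real threshold would need to be tuned on held-out data.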
Notes
1. A live event detection system using the method of [9] is available at http://lamaevents.cls.ru.nl/.
2. TOR requests are an indication of the number of people who choose to hide their identity and location.
3. Since we wanted to provide our system with as much training data as possible, we also extracted relevant tweets that were posted after the event took place. Obviously, when predicting events in the future, this type of data will be unavailable.
4. In addition to Naive Bayes, we experimented with Support Vector Machines and K-nearest neighbors. We will only report on the outcomes of Naive Bayes, which yielded the best performance.
5.
6. This was calculated by using the weighted setting in scikit-learn, which is why the F-score is not necessarily between precision and recall.
7. In personal communication, we asked Alan Ritter about this distribution. Unfortunately, he was unable to recover the document with the specific division of categories in the test set.
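The remark about the weighted setting can be illustrated with a small worked example. scikit-learn's `average="weighted"` option averages per-class precision, recall, and F-score, weighting each class by its number of gold instances. Because the F-scores are averaged per class rather than computed from the averaged precision and recall, the result can fall below both. The sketch below recomputes this in plain Python on hypothetical labels `A` and `B`; the data are invented for illustration and are not from the paper.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every gold label."""
    scores = {}
    for label in sorted(set(gold)):
        tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def weighted_average(gold, predicted):
    """Average per-class scores, weighting each class by its gold support."""
    support = Counter(gold)
    total = len(gold)
    scores = per_class_scores(gold, predicted)
    averaged = []
    for i in range(3):  # 0 = precision, 1 = recall, 2 = F1
        averaged.append(sum(support[l] / total * scores[l][i] for l in scores))
    return tuple(averaged)


# Hypothetical data: class A is almost never predicted, class B always is.
gold = ["A"] * 10 + ["B"] * 10
predicted = ["A"] + ["B"] * 9 + ["B"] * 10

weighted_p, weighted_r, weighted_f = weighted_average(gold, predicted)
print(f"precision={weighted_p:.3f} recall={weighted_r:.3f} f1={weighted_f:.3f}")
```

Here class A has perfect precision but very low recall, so its per-class F-score is low and pulls the weighted F-score below both the weighted precision and the weighted recall.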
References
Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia - a crystallization point for the web of data. Web Semant. Sci. Serv. Agents World Wide Web 7(3), 154–165 (2009)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Compton, R., Lee, C.-K., Lu, T.-C., de Silva, L., Macy, M.: Detecting future social unrest in unprocessed Twitter data: emerging phenomena and big data. In: 2013 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 56–60. IEEE (2013)
De Smedt, T., Daelemans, W.: Pattern for Python. J. Mach. Learn. Res. 13(1), 2063–2067 (2012)
Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Commun. Methods Meas. 1(1), 77–89 (2007)
Korkmaz, G., Cadena, J., Kuhlman, C.J., Marathe, A., Vullikanti, A., Ramakrishnan, N.: Multi-source models for civil unrest forecasting. Soc. Netw. Anal. Min. 6(1), 1–25 (2016)
Krippendorff, K.: Content Analysis: An Introduction to Its Methodology. Sage, Thousand Oaks (2004)
Kunneman, F., van den Bosch, A.: Automatically identifying periodic social events from Twitter. In: Proceedings of the RANLP 2015, pp. 320–328 (2015)
Kunneman, F., van den Bosch, A.: Open-domain extraction of future events from Twitter. Nat. Lang. Eng. 22, 655–686 (2016)
Meij, E., Weerkamp, W., de Rijke, M.: Adding semantics to microblog posts. In: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, pp. 563–572. ACM (2012)
Mihalcea, R., Csomai, A.: Wikify!: linking documents to encyclopedic knowledge. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, pp. 233–242. ACM (2007)
Muthiah, S., Huang, B., Arredondo, J., Mares, D., Getoor, L., Katz, G., Ramakrishnan, N.: Planned protest modeling in news and social media. In: AAAI, pp. 3920–3927 (2015)
NOS: Cohen: fouten politie, burgemeester. Nederlandse Omroep Stichting, 7 March 2013. http://nos.nl/
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Ramage, D., Dumais, S.T., Liebling, D.J.: Characterizing microblogs with topic models. ICWSM 10, 1 (2010)
Ramakrishnan, N., Butler, P., Muthiah, S., Self, N., Khandpur, R., Saraf, P., Wang, W., Cadena, J., Vullikanti, A., Korkmaz, G., et al.: ‘Beating the news’ with embers: forecasting civil unrest using open source indicators. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1799–1808. ACM (2014)
Ritter, A., Etzioni, O., Clark, S., et al.: Open domain event extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1104–1112. ACM (2012)
van Heerden, D.: Facebook birthday invite leads to mayhem in Dutch town, authorities say. CNN, 24 September 2012. http://edition.cnn.com/
Volkskrant: Enkele duizenden bij protestmars bezuinigingen. de Volkskrant, 21 September 2013. http://volkskrant.nl/
Zhao, W.X., Jiang, J., Weng, J., He, J., Lim, E.-P., Yan, H., Li, X.: Comparing Twitter and traditional media using topic models. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Murdock, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 338–349. Springer, Heidelberg (2011). doi:10.1007/978-3-642-20161-5_34
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
van Noord, R., Kunneman, F.A., van den Bosch, A. (2017). Predicting Civil Unrest by Categorizing Dutch Twitter Events. In: Bosse, T., Bredeweg, B. (eds) BNAIC 2016: Artificial Intelligence. BNAIC 2016. Communications in Computer and Information Science, vol 765. Springer, Cham. https://doi.org/10.1007/978-3-319-67468-1_1
DOI: https://doi.org/10.1007/978-3-319-67468-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67467-4
Online ISBN: 978-3-319-67468-1
eBook Packages: Computer Science, Computer Science (R0)