Predicting Civil Unrest by Categorizing Dutch Twitter Events

  • Conference paper
BNAIC 2016: Artificial Intelligence (BNAIC 2016)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 765))

Abstract

We propose a system that assigns topical labels to automatically detected events in the Twitter stream. The automatic detection and labeling of events in social media streams is challenging due to the large number and variety of messages that are posted. The early detection of future social events, specifically those associated with civil unrest, has wide applicability in areas such as security, e-governance, and journalism. We used machine learning algorithms and encoded the social media data using a wide range of features. Experiments show high-precision but low-recall performance in the first step. We designed a second step that exploits classification probabilities, boosting the recall of our category of interest, social action events.
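
The abstract does not spell out how the second step uses classification probabilities, so the following is only a sketch of one possible reading: an event is assigned the category of interest whenever that category's probability clears a threshold, even if another category is more probable. The function name, the threshold value of 0.2, and the category names other than "social action" are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List


def relabel_with_threshold(probs: Dict[str, float],
                           target: str = "social action",
                           threshold: float = 0.2) -> str:
    """Return `target` when its probability clears `threshold`;
    otherwise fall back to the most probable label (the first-step decision)."""
    if probs.get(target, 0.0) >= threshold:
        return target
    return max(probs, key=probs.get)


# Hypothetical class-probability outputs for three detected events.
events: List[Dict[str, float]] = [
    {"sport": 0.70, "social action": 0.25, "politics": 0.05},
    {"sport": 0.90, "social action": 0.04, "politics": 0.06},
    {"sport": 0.10, "social action": 0.60, "politics": 0.30},
]

# First step: plain argmax over the probabilities.
first_step = [max(p, key=p.get) for p in events]
# Second step (assumed rule): thresholded relabeling of the target category.
second_step = [relabel_with_threshold(p) for p in events]

print(first_step)   # ['sport', 'sport', 'social action']
print(second_step)  # ['social action', 'sport', 'social action']
```

Under this assumed rule the first event moves into the target category, which raises recall for that category at a likely cost in precision. The threshold controls that trade-off.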

Notes

  1. A live event detection system using the method of [9] is available at http://lamaevents.cls.ru.nl/.

  2. TOR requests are an indication of the number of people who choose to hide their identity and location.

  3. Since we wanted to provide our system with as much training data as possible, we also extracted relevant tweets that were posted after the event took place. Obviously, when predicting events in the future, this type of data will be unavailable.

  4. In addition to Naive Bayes, we experimented with Support Vector Machines and K-nearest neighbors. We will only report on the outcomes of Naive Bayes, which yielded the best performance.

  5. http://scikit-learn.org.

  6. This was calculated by using the weighted setting in scikit-learn, which is why the F-score is not necessarily between precision and recall.

  7. In personal communication, we asked Alan Ritter about this distribution. Unfortunately, he was unable to recover the document with the specific division of categories in the test set.
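
Note 6 can be illustrated with a small worked example. The sketch below computes support-weighted averages of per-class precision, recall, and F1 by hand, which is how scikit-learn's `average="weighted"` option is understood to behave. It is a plain-Python reimplementation for illustration, not the library's code, and the toy labels `A` and `B` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def weighted_prf(y_true: List[str], y_pred: List[str]) -> Tuple[float, float, float]:
    """Support-weighted average of per-class precision, recall, and F1."""
    support = Counter(y_true)
    total = len(y_true)
    w_p = w_r = w_f = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        precision = tp / predicted if predicted else 0.0
        recall = tp / n
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        weight = n / total  # each class counts in proportion to its support
        w_p += weight * precision
        w_r += weight * recall
        w_f += weight * f1
    return w_p, w_r, w_f


# Toy data: class A has perfect precision but low recall,
# class B has perfect recall but lower precision.
y_true = ["A"] * 5 + ["B"] * 5
y_pred = ["A"] + ["B"] * 4 + ["B"] * 5

p, r, f = weighted_prf(y_true, y_pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
# precision=0.778 recall=0.600 f1=0.524
```

Each per-class F1 is a harmonic mean and is pulled toward the lower of that class's precision and recall. Averaging those per-class values can therefore land below both the averaged precision and the averaged recall, as it does here.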

References

  1. Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia - a crystallization point for the web of data. Web Semant. Sci. Serv. Agents World Wide Web 7(3), 154–165 (2009)

  2. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

  3. Compton, R., Lee, C.-K., Lu, T.-C., de Silva, L., Macy, M.: Detecting future social unrest in unprocessed Twitter data: emerging phenomena and big data. In: 2013 IEEE International Conference on Intelligence and Security Informatics (ISI), pp. 56–60. IEEE (2013)

  4. De Smedt, T., Daelemans, W.: Pattern for Python. J. Mach. Learn. Res. 13(1), 2063–2067 (2012)

  5. Hayes, A.F., Krippendorff, K.: Answering the call for a standard reliability measure for coding data. Commun. Methods Meas. 1(1), 77–89 (2007)

  6. Korkmaz, G., Cadena, J., Kuhlman, C.J., Marathe, A., Vullikanti, A., Ramakrishnan, N.: Multi-source models for civil unrest forecasting. Soc. Netw. Anal. Min. 6(1), 1–25 (2016)

  7. Krippendorff, K.: Content Analysis: An Introduction to Its Methodology. Sage, Thousand Oaks (2004)

  8. Kunneman, F., van den Bosch, A.: Automatically identifying periodic social events from Twitter. In: Proceedings of the RANLP 2015, pp. 320–328 (2015)

  9. Kunneman, F., van den Bosch, A.: Open-domain extraction of future events from Twitter. Nat. Lang. Eng. 22, 655–686 (2016)

  10. Meij, E., Weerkamp, W., de Rijke, M.: Adding semantics to microblog posts. In: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, pp. 563–572. ACM (2012)

  11. Mihalcea, R., Csomai, A.: Wikify!: linking documents to encyclopedic knowledge. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, pp. 233–242. ACM (2007)

  12. Muthiah, S., Huang, B., Arredondo, J., Mares, D., Getoor, L., Katz, G., Ramakrishnan, N.: Planned protest modeling in news and social media. In: AAAI, pp. 3920–3927 (2015)

  13. NOS: Cohen: fouten politie, burgemeester. Nederlandse Omroep Stichting, 7 March 2013. http://nos.nl/

  14. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  15. Ramage, D., Dumais, S.T., Liebling, D.J.: Characterizing microblogs with topic models. ICWSM 10, 1 (2010)

  16. Ramakrishnan, N., Butler, P., Muthiah, S., Self, N., Khandpur, R., Saraf, P., Wang, W., Cadena, J., Vullikanti, A., Korkmaz, G., et al.: ‘Beating the news’ with embers: forecasting civil unrest using open source indicators. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1799–1808. ACM (2014)

  17. Ritter, A., Etzioni, O., Clark, S., et al.: Open domain event extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1104–1112. ACM (2012)

  18. van Heerden, D.: Facebook birthday invite leads to mayhem in Dutch town, authorities say. CNN, 24 September 2012. http://edition.cnn.com/

  19. Volkskrant: Enkele duizenden bij protestmars bezuinigingen. de Volkskrant, 21 September 2013. http://volkskrant.nl/

  20. Zhao, W.X., Jiang, J., Weng, J., He, J., Lim, E.-P., Yan, H., Li, X.: Comparing Twitter and traditional media using topic models. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Mudoch, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 338–349. Springer, Heidelberg (2011). doi:10.1007/978-3-642-20161-5_34

Author information

Corresponding author

Correspondence to Rik van Noord.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

van Noord, R., Kunneman, F.A., van den Bosch, A. (2017). Predicting Civil Unrest by Categorizing Dutch Twitter Events. In: Bosse, T., Bredeweg, B. (eds) BNAIC 2016: Artificial Intelligence. BNAIC 2016. Communications in Computer and Information Science, vol 765. Springer, Cham. https://doi.org/10.1007/978-3-319-67468-1_1

  • DOI: https://doi.org/10.1007/978-3-319-67468-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67467-4

  • Online ISBN: 978-3-319-67468-1

  • eBook Packages: Computer Science, Computer Science (R0)