Challenges in Applying Machine Learning Methods: Studying Political Interactions on Social Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10546)


This paper discusses the potential role of machine learning (ML) methods in social science research in general, and in studies of users' political behavior on social networks (SN) in particular. It examines challenges that arose in a set of studies we conducted on classifying comments on politicians' posts, suggests ways of addressing these challenges, and argues that they apply to a broader range of online political behavior studies.
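The classification task studied here, deciding whether a comment on a politician's post is on-topic, is typically framed as supervised text classification. As a minimal sketch (not the authors' actual method), the following pure-Python multinomial Naive Bayes classifier with hypothetical toy training data illustrates the setup: label a handful of comments as relevant or irrelevant, estimate smoothed word likelihoods per class, and score new comments.

```python
from collections import Counter
import math

# Hypothetical toy data: comments on a politician's post, labeled by relevance.
TRAIN = [
    ("great policy on housing reform", "relevant"),
    ("what is your stance on taxes", "relevant"),
    ("the budget vote was a mistake", "relevant"),
    ("buy cheap watches online now", "irrelevant"),
    ("follow me for free giveaways", "irrelevant"),
    ("nice photo where was it taken", "irrelevant"),
]

def train(examples):
    """Fit a multinomial Naive Bayes model with add-one smoothing."""
    word_counts = {}          # label -> Counter of word occurrences
    label_counts = Counter()  # label -> number of training comments
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        wc = word_counts.setdefault(label, Counter())
        for w in text.split():
            wc[w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def classify(text, word_counts, label_counts, vocab):
    """Return the label with the highest log-posterior for the comment."""
    total = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for label, lc in label_counts.items():
        lp = math.log(lc / total)  # log prior
        wc = word_counts[label]
        denom = sum(wc.values()) + len(vocab)  # smoothed denominator
        for w in text.split():
            lp += math.log((wc[w] + 1) / denom)  # smoothed likelihood
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train(TRAIN)
print(classify("your stance on the housing budget", *model))  # → relevant
```

In practice the studies referenced below rely on richer feature sets and established toolkits (e.g., WEKA), but the core pipeline of labeled examples, feature counts, and a probabilistic or discriminative classifier is the same.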


Keywords: Comment relevance classification · Machine learning · Social media · Supervised learning · Political content · Political behavior



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Jerusalem College of Technology, Lev Academic Center, Jerusalem, Israel
  2. Interdisciplinary Center (IDC) Herzliya, Herzliya, Israel
  3. University of Washington, Seattle, USA