Analysing user identity via time-sensitive semantic edit distance (t-SED): a case study of Russian trolls on Twitter

  • Research Article
  • Journal of Computational Social Science

Abstract

In the digital era, individuals are increasingly profiled and grouped based on the traces they leave behind in online social networks such as Twitter and Facebook. In this paper, we develop and evaluate a novel text analysis approach for studying user identity and social roles by redefining identity as a sequence of timestamped items (e.g., tweet texts). We operationalise this idea by developing a novel text distance metric, the time-sensitive semantic edit distance (t-SED), which accounts for the temporal context across multiple traces. To evaluate this method, we undertake a case study of Russian online-troll activity within US political discourse. The new metric allows us to classify the social roles of trolls, based on their traces (in this case, tweets), into one of three predefined categories: left-leaning, right-leaning, and news feed. We show the effectiveness of t-SED for measuring similarities between tweets while accounting for temporal context, and we use novel data visualisation techniques and qualitative analysis to uncover empirical insights into Russian troll activity that have not been identified in previous work. In addition, we highlight a connection with the field of actor-network theory and the related hypotheses of Gabriel Tarde, and we discuss how social sequence analysis using t-SED may provide new avenues for tackling a longstanding problem in social theory: how to analyse society without separating reality into micro- and macro-levels.


Notes

  1. Available at https://github.com/fivethirtyeight/russian-troll-tweets/.

  2. MAGA is an acronym that stands for Make America Great Again. It was the election slogan used by Donald Trump during his election campaign in 2016, and has subsequently become a central theme of his presidency.

  3. Bernie Sanders was the alternative candidate for the Democratic presidential nomination in 2016.

  4. Available at https://github.com/s/preprocessor.

  5. The bag-of-words model is used to map a sequence to a vector (a minimal sketch is given after these notes).

  6. Length normalisation: \(\text{sed}(\varvec{a},\varvec{b}) / \max(|\varvec{a}|, |\varvec{b}|)\).

  7. Ratio normalisation: \(\text{sed}(\varvec{a},\varvec{b}) / \text{ed}(\varvec{a},\varvec{b})\) (see the sketch after these notes).

  8. Twitter had a 140-character limit before November 2017.
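The bag-of-words mapping in note 5 can be illustrated with a minimal Python sketch; the vocabulary and example tweets below are purely illustrative and not taken from the paper's data:

from collections import Counter

def bag_of_words(tweets, vocabulary):
    # Count word occurrences across a user's whole trace (sequence of tweets)
    # and project the counts onto a fixed, ordered vocabulary.
    counts = Counter(token for tweet in tweets for token in tweet.lower().split())
    return [counts[term] for term in vocabulary]

# Example: a user's trace (a sequence of tweets) becomes one count vector.
vocab = ["maga", "vote", "news", "debate"]
trace = ["Vote MAGA", "Breaking news on the debate", "vote vote vote"]
print(bag_of_words(trace, vocab))  # [1, 4, 1, 1]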
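Notes 6 and 7 describe two normalisations of the semantic edit distance sed(a, b): by sequence length and by the standard edit distance ed(a, b). The following Python sketch is a rough illustration under assumptions, not the paper's implementation: it computes a word-level edit distance with a pluggable substitution cost (the toy similarity table and cost function are hypothetical) and applies both normalisations. The paper's t-SED additionally accounts for temporal context, which is omitted here:

import numpy as np

def edit_distance(a, b, sub_cost=lambda x, y: 0.0 if x == y else 1.0):
    # Word-level edit distance between token sequences a and b.
    # With the default 0/1 substitution cost this is the standard ed(a, b);
    # plugging in an embedding-based cost gives a semantic variant sed(a, b).
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1))
    d[:, 0] = np.arange(m + 1)   # deletions
    d[0, :] = np.arange(n + 1)   # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i, j] = min(
                d[i - 1, j] + 1,                                  # delete a[i-1]
                d[i, j - 1] + 1,                                  # insert b[j-1]
                d[i - 1, j - 1] + sub_cost(a[i - 1], b[j - 1]),   # substitute
            )
    return float(d[m, n])

def length_normalised(sed_ab, a, b):
    # Note 6: sed(a, b) / max(|a|, |b|).
    return sed_ab / max(len(a), len(b))

def ratio_normalised(sed_ab, ed_ab):
    # Note 7: sed(a, b) / ed(a, b).
    return sed_ab / ed_ab if ed_ab else 0.0

# Toy semantic substitution cost from a hypothetical similarity table.
sim = {("vote", "ballot"): 0.8}
def sem_cost(x, y):
    if x == y:
        return 0.0
    return 1.0 - sim.get((x, y), sim.get((y, x), 0.0))

a = "go vote today".split()
b = "go ballot today".split()
sed_ab = edit_distance(a, b, sub_cost=sem_cost)  # 0.2: 'vote' vs 'ballot' is a cheap substitution
ed_ab = edit_distance(a, b)                      # 1.0: one full-cost substitution
print(length_normalised(sed_ab, a, b))           # 0.2 / 3, roughly 0.067
print(ratio_normalised(sed_ab, ed_ab))           # 0.2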

Author information

Corresponding author

Correspondence to Timothy Graham.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, D., Graham, T., Wan, Z. et al. Analysing user identity via time-sensitive semantic edit distance (t-SED): a case study of Russian trolls on Twitter. J Comput Soc Sc 2, 331–351 (2019). https://doi.org/10.1007/s42001-019-00051-x
