Abstract
In the digital era, individuals are increasingly profiled and grouped based on the traces they leave behind in online social networks such as Twitter and Facebook. In this paper, we develop and evaluate a novel text analysis approach for studying user identity and social roles by redefining identity as a sequence of timestamped items (e.g., tweet texts). We operationalise this idea through a novel text distance metric, the time-sensitive semantic edit distance (t-SED), which accounts for the temporal context across multiple traces. To evaluate the method, we undertake a case study of Russian online-troll activity within US political discourse. The metric allows us to classify the social roles of trolls, based on their traces (in this case, tweets), into one of three predefined categories: left-leaning, right-leaning, and news feed. We show the effectiveness of t-SED for measuring the similarity between tweets while accounting for temporal context, and we use novel data-visualisation techniques and qualitative analysis to uncover empirical insights into Russian troll activity that have not been identified in prior work. In addition, we highlight a connection with actor–network theory and the related hypotheses of Gabriel Tarde, and we discuss how social sequence analysis using t-SED may open new avenues for tackling a longstanding problem in social theory: how to analyse society without separating reality into micro- and macro-levels.
Notes
Available at https://github.com/fivethirtyeight/russian-troll-tweets/.
MAGA is an acronym for Make America Great Again, the slogan used by Donald Trump during his 2016 election campaign; it has subsequently become a central theme of his presidency.
Bernie Sanders was the main rival candidate for the Democratic Party's 2016 presidential nomination.
Available at https://github.com/s/preprocessor.
A bag-of-words representation is used to map a token sequence to a vector.
Length normalisation: \(\text{sed}(\boldsymbol{a},\boldsymbol{b})/\max(|\boldsymbol{a}|,|\boldsymbol{b}|)\).
Ratio normalisation: \(\text{sed}(\boldsymbol{a},\boldsymbol{b})/\text{ed}(\boldsymbol{a},\boldsymbol{b})\).
Twitter imposed a 140-character limit on tweets before November 2017.
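The bag-of-words mapping mentioned in the notes above can be sketched minimally as follows. This is an illustrative sketch only; the vocabulary and tokenisation shown here are hypothetical and not taken from the paper.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    # Map a token sequence to a fixed-length count vector over a vocabulary.
    # Word order is discarded: only per-word counts survive.
    counts = Counter(tokens)
    return [counts[w] for w in vocabulary]

vocab = ["america", "great", "make", "news"]
vec = bag_of_words("make america great again make".split(), vocab)
print(vec)  # → [1, 1, 2, 0]  ("again" is out of vocabulary and dropped)
```

Because this mapping discards word order, sequence-aware measures such as edit distance retain information that the bag-of-words vector cannot.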
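The two normalisations of the semantic edit distance noted above can be illustrated with a toy sketch. This is not the authors' implementation: `semantic_sub_cost` is a hypothetical stand-in (character-set Jaccard distance) for the embedding-based word-substitution cost used in the paper, and the temporal weighting that makes t-SED time-sensitive is omitted. `sed` and `ed` both follow the standard dynamic-programming edit-distance recurrence.

```python
def edit_distance(a, b, sub_cost):
    # Standard dynamic-programming edit distance over token sequences,
    # parameterised by the substitution-cost function.
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = float(i)                                      # deletions
    for j in range(n + 1):
        d[0][j] = float(j)                                      # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,                      # delete a[i-1]
                          d[i][j - 1] + 1,                      # insert b[j-1]
                          d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]))
    return d[m][n]

def ed(a, b):
    # Plain edit distance: substitution costs 0 (match) or 1 (mismatch).
    return edit_distance(a, b, lambda u, v: 0.0 if u == v else 1.0)

def semantic_sub_cost(u, v):
    # Hypothetical stand-in for an embedding-based substitution cost:
    # Jaccard distance between the words' character sets, in [0, 1].
    su, sv = set(u), set(v)
    return 1.0 - len(su & sv) / len(su | sv)

def sed(a, b):
    # Semantic edit distance: substitutions are charged by word
    # dissimilarity instead of a flat cost of 1.
    return edit_distance(a, b, semantic_sub_cost)

a = "make america great again".split()
b = "keep america great".split()
length_norm = sed(a, b) / max(len(a), len(b))  # length normalisation
ratio_norm = sed(a, b) / ed(a, b)              # ratio normalisation
```

Under length normalisation, similar-but-not-identical substitutions (e.g. "make" vs. "keep") are charged less than a full edit, so the normalised distance falls below the plain edit-distance baseline.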
References
Abbott, A. (1995). Sequence analysis: New methods for old ideas. Annual Review of Sociology, 21(1), 93–113.
Abbott, A., & Tsay, A. (2000). Sequence analysis and optimal matching methods in sociology: Review and prospect. Sociological Methods & Research, 29(1), 3–33.
Badawy, A., Ferrara, E., & Lerman, K. (2018). Analyzing the digital traces of political manipulation: The 2016 Russian interference Twitter campaign. arXiv:1802.04291 (preprint).
Bessi, A., & Ferrara, E. (2016). Social bots distort the 2016 U.S. presidential election online discussion. First Monday, 21(11). https://doi.org/10.5210/fm.v21i11.7090
Broniatowski, D. A., Jamison, A. M., Qi, S., AlKulaib, L., Chen, T., Benton, A., Quinn, S. C., & Dredze, M. (2018). Weaponized health communication: Twitter bots and Russian trolls amplify the vaccine debate. American Journal of Public Health, 108(10). https://doi.org/10.2105/AJPH.2018.304567
Buckels, E. E., Trapnell, P. D., & Paulhus, D. L. (2014). Trolls just want to have fun. Personality and Individual Differences, 67, 97–102. https://doi.org/10.1016/j.paid.2014.01.016
Cook, D. M., Waugh, B., Abdipanah, M., Hashemi, O., & Rahman, S. A. (2014). Twitter deception and influence: Issues of identity, slacktivism, and puppetry. Journal of Information Warfare, 13(1), 58–71.
Cornwell, B. (2015). Social sequence analysis: Methods and applications, (Vol. 37). Cambridge: Cambridge University Press.
Davis, C. A., Varol, O., Ferrara, E., Flammini, A., & Menczer, F. (2016). BotOrNot: A system to evaluate social bots. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 273–274). International World Wide Web Conferences Steering Committee.
Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96–104.
Flores-Saviaga, C., Keegan, B., & Savage, S. (2018). Mobilizing the Trump train: Understanding collective action in a political trolling community. In Twelfth International AAAI Conference on Web and Social Media (ICWSM 2018).
Herring, S., Job-Sluder, K., Scheckler, R., & Barab, S. (2002). Searching for safety online: Managing “trolling” in a feminist forum. The Information Society, 18(5), 371–384.
Kollanyi, B., Howard, P. N., & Woolley, S. C. (2016). Bots and automation over Twitter during the first U.S. presidential debate. COMPROP Data Memo No. 1. http://blogs.oii.ox.ac.uk/politicalbots/wp-content/uploads/sites/89/2016/10/Data-Memo-First-Presidential-Debate.pdf. Accessed 1 Nov 2018.
Kumar, S., Cheng, J., Leskovec, J., & Subrahmanian, V. (2017). An army of me: Sockpuppets in online discussion communities. In Proceedings of the 26th International Conference on World Wide Web (pp. 857–866). International World Wide Web Conferences Steering Committee.
Latour, B. (2002). Gabriel Tarde and the end of the social. In P. Joyce (Ed.), The Social in Question: New Bearings in History and the Social Sciences (pp. 117–133). London: Routledge.
Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). ‘The whole is always smaller than its parts’—A digital test of Gabriel Tarde's monads. The British Journal of Sociology, 63(4), 590–615.
Leskovec, J., Backstrom, L., Kumar, R., & Tomkins, A. (2008). Microscopic evolution of social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 462–470). ACM.
Boatwright, B. C., Linvill, D. L., & Warren, P. L. (2018). Troll factories: The internet research agency and state-sponsored agenda building. Resource Centre on Media Freedom in Europe. http://pwarren.people.clemson.edu/Linvill_Warren_TrollFactory.pdf. Accessed 1 Nov 2018.
van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605.
Mihaylov, T., Georgiev, G., & Nakov, P. (2015). Finding opinion manipulation trolls in news community forums. In Proceedings of the Nineteenth Conference on Computational Natural Language Learning (pp. 310–314).
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv:1301.3781. Accessed 10 Dec 2018.
Navarro, G. (2001). A guided tour to approximate string matching. ACM Computing Surveys (CSUR), 33(1), 31–88.
Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1532–1543).
Phinney, J. S. (2000). Ethnic and racial identity: Ethnic identity. In A. E. Kazdin (Ed.), Encyclopedia of Psychology (Vol. 3, pp. 254–259). Washington, DC: American Psychological Association.
Rizoiu, M. A., Graham, T., Zhang, R., Zhang, Y., Ackland, R., & Xie, L. (2018a). #DebateNight: The role and influence of socialbots on Twitter during the first 2016 US presidential debate. In 12th International AAAI Conference on Web and Social Media (ICWSM 2018).
Rizoiu, M. A., Lee, Y., Mishra, S., & Xie, L. (2018b). Hawkes processes for events in social media. In S. F. Chang (Ed.), Frontiers of Multimedia Research (pp. 191–218). New York: Springer. https://doi.org/10.1145/3122865.3122874
Shao, C., Ciampaglia, G. L., Varol, O., Flammini, A., & Menczer, F. (2017). The spread of fake news by social bots (pp. 96–104). https://www.andyblackassociates.co.uk/wp-content/uploads/2015/06/fakenewsbots.pdf. Accessed 20 Oct 2018.
Stewart, L. G., Arif, A., & Starbird, K. (2018). Examining trolls and polarization with a retweet network. In Proceedings of Web Search and Data Mining (2018), Workshop on Misinformation and Misbehavior Mining on the Web. http://faculty.washington.edu/kstarbi/examining-trolls-polarization.pdf. Accessed 1 Dec 2018.
Tarde, G. (2012 [1895]). Monadology and sociology. Melbourne: Re.press.
Varol, O., Ferrara, E., Davis, C. A., Menczer, F., & Flammini, A. (2017). Online human-bot interactions: Detection, estimation, and characterization. arXiv:1703.03107 (preprint).
Zannettou, S., Caulfield, T., Setzer, W., Sirivianos, M., Stringhini, G., & Blackburn, J. (2018). Who let the trolls out? Towards understanding state-sponsored trolls. arXiv:1811.03130 (preprint).
Ethics declarations
Conflict of Interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Cite this article
Kim, D., Graham, T., Wan, Z. et al. Analysing user identity via time-sensitive semantic edit distance (t-SED): a case study of Russian trolls on Twitter. J Comput Soc Sc 2, 331–351 (2019). https://doi.org/10.1007/s42001-019-00051-x