Using Behavior and Text Analysis to Detect Propagandists and Misinformers on Twitter

  • Conference paper
  • First Online:
Information Management and Big Data (SIMBig 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 898)

Abstract

Organized groups disseminate similar messages in online forums and social media, responding to real-time events or following a persistent policy, and operate with state-level or organizational funding. Identifying these groups is of vital importance for preventing the distribution of sponsored propaganda and misinformation. This paper presents an unsupervised approach that uses behavioral and text analysis of users and messages to identify groups of users who abuse the Twitter micro-blogging service to disseminate propaganda and misinformation. Groups of users who frequently post strikingly similar content at different times are identified through repeated clustering and frequent itemset mining, and the lack of credibility of their content is validated through human assessment. This paper introduces a case study in automatic identification of propagandists and misinformers in social media.
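The two-stage pipeline described in the abstract — clustering near-duplicate messages within each time window, then mining frequent sets of users who repeatedly land in the same clusters — can be sketched roughly as follows. The token-set Jaccard similarity, the greedy single-pass clustering, and the thresholds are illustrative assumptions, not the authors' exact implementation:

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_window(tweets, sim=0.6):
    """Greedily group near-duplicate tweets within one time window.

    tweets: list of (user, text) pairs.
    Returns the user sets of clusters that contain more than one user.
    """
    clusters = []  # each entry: (representative token set, set of users)
    for user, text in tweets:
        toks = set(text.lower().split())
        for rep, users in clusters:
            if jaccard(toks, rep) >= sim:
                users.add(user)
                break
        else:
            clusters.append((toks, {user}))
    return [users for _, users in clusters if len(users) > 1]

def frequent_user_sets(transactions, minsup=2, max_k=3):
    """Apriori-style mining of user groups that co-occur in >= minsup clusters.

    transactions: list of user sets, one per message cluster (across windows).
    Returns {frozenset of users: support count}.
    """
    freq = {}
    items = {u for t in transactions for u in t}
    candidates = [frozenset([u]) for u in items]
    k = 1
    while candidates and k <= max_k:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        survivors = {c: n for c, n in counts.items() if n >= minsup}
        freq.update(survivors)
        # join step: build (k+1)-candidates from surviving k-itemsets
        candidates = list({a | b for a, b in combinations(list(survivors), 2)
                           if len(a | b) == k + 1})
        k += 1
    return freq
```

In use, each time window's tweets are clustered independently, the resulting user sets are pooled as transactions, and user groups whose support clears the threshold become candidate coordinated accounts for human assessment.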


Notes

  1. VK is a social network popular in Russia; see https://vk.com.

  2. Propaganda is defined as: “posts that contain information, especially of a biased or misleading nature, that is used to promote or publicize a particular political cause or point of view” (Oxford English Dictionary, 3rd Online Edition).

  3. A tweet was considered a document, and the collection of all tweets a corpus.

  4. By “user” we mean an account, not an individual, based on assumption (4).

  5. Edge weights are normalized to the range [0, 1].

  6. Topic modeling was performed using KNIME’s LDA implementation.

  7. See http://www.propornot.com.
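Footnotes 3 and 5 suggest a standard vector-space treatment: each tweet is a document, the set of all tweets is the corpus, and pairwise similarities serve as edge weights in [0, 1]. A minimal sketch of that representation follows; TF-IDF with cosine similarity is an assumption here for illustration, since the paper's own processing was done in KNIME:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Treat each tweet as a document and the whole set as the corpus
    (footnote 3); returns one sparse TF-IDF vector per tweet."""
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c / len(toks) * idf[t] for t, c in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    """Cosine similarity; lies in [0, 1] for nonnegative TF-IDF weights,
    matching the normalized edge weights of footnote 5."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Tweets sharing distinctive vocabulary receive a high similarity, while tweets with no terms in common score zero, so the similarities can be used directly as normalized edge weights.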

References

  1. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I., et al.: Fast discovery of association rules. Adv. Knowl. Discov. Data Min. 12(1), 307–328 (1996)


  2. Akoglu, L., Chandy, R., Faloutsos, C.: Opinion fraud detection in online reviews by network effects. In: ICWSM 2013, pp. 2–11 (2013)


  3. Allcott, H., Gentzkow, M.: Social media and fake news in the 2016 election. J. Econ. Perspect. 31(2), 211–236 (2017)


  4. Berthold, M.R., et al.: KNIME: the Konstanz information miner. In: Preisach, C., Burkhardt, H., Schmidt-Thieme, L., Decker, R. (eds.) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization, pp. 319–326. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78246-9_38


  5. Castillo, C., Mendoza, M., Poblete, B.: Information credibility on Twitter. In: Proceedings of the 20th International Conference on World Wide Web, WWW 2011, pp. 675–684. ACM, New York (2011). https://doi.org/10.1145/1963405.1963500

  6. Chen, Y., Conroy, N.J., Rubin, V.L.: Misleading online content: recognizing clickbait as false news. In: Proceedings of the 2015 ACM on Workshop on Multimodal Deception Detection, pp. 15–19. ACM (2015)


  7. Conroy, N.J., Rubin, V.L., Chen, Y.: Automatic deception detection: methods for finding fake news. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–4 (2015)


  8. Hamidian, S., Diab, M.T.: Rumor identification and belief investigation on Twitter. In: WASSA@ NAACL-HLT, pp. 3–8 (2016)


  9. Kleinberg, J.M., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.S.: The web as a graph: measurements, models, and methods. In: Asano, T., Imai, H., Lee, D.T., Nakano, S., Tokuyama, T. (eds.) COCOON 1999. LNCS, vol. 1627, pp. 1–17. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48686-0_1


  10. Li, Z.C., He, P.L., Lei, M.: A high efficient AprioriTid algorithm for mining association rule. In: 2005 International Conference on Machine Learning and Cybernetics, vol. 3, pp. 1812–1815. IEEE, August 2005. https://doi.org/10.1109/ICMLC.2005.1527239

  11. Lumezanu, C., Feamster, N., Klein, H.: #bias: measuring the tweeting behavior of propagandists. In: Sixth International AAAI Conference on Weblogs and Social Media (2012). http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4588

  12. Meduza: Authors of paid comments in support of Moscow authorities forgot to edit assignment (2017). https://meduza.io/shapito/2017/02/03/avtory-platnyh-kommentariev-v-podderzhku-moskovskih-vlastey-zabyli-otredaktirovat-zadanie

  13. Mehta, B., Hofmann, T., Fankhauser, P.: Lies and propaganda: detecting spam users in collaborative filtering. In: Proceedings of the 12th International Conference on Intelligent User Interfaces, pp. 14–21. ACM (2007). https://doi.org/10.1145/1216295.1216307

  14. Mendoza, M., Poblete, B., Castillo, C.: Twitter under crisis: can we trust what we RT? In: Proceedings of the First Workshop on Social Media Analytics, SOMA 2010, pp. 71–79. ACM, New York (2010). https://doi.org/10.1145/1964858.1964869

  15. Metaxas, P.: Using propagation of distrust to find untrustworthy web neighborhoods. In: 2009 Fourth International Conference on Internet and Web Applications and Services, ICIW 2009, pp. 516–521. IEEE (2009). https://doi.org/10.1109/ICIW.2009.83

  16. Paul, C., Matthews, M.: The Russian “Firehose of Falsehood” Propaganda Model. RAND Corporation, Santa Monica (2016)


  17. Qazvinian, V., Rosengren, E., Radev, D.R., Mei, Q.: Rumor has it: identifying misinformation in microblogs. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, pp. 1589–1599. Association for Computational Linguistics, Stroudsburg (2011). https://www.aclweb.org/anthology/D11-1147

  18. Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)


  19. The Alliance for Securing Democracy: Hamilton 68 (2017). https://dashboard.securingdemocracy.org

  20. Volkova, S., Shaffer, K., Jang, J.Y., Hodas, N.: Separating facts from fiction: linguistic models to classify suspicious and trusted news posts on Twitter. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Short Papers, vol. 2, pp. 647–653 (2017)


  21. Zubiaga, A., Liakata, M., Procter, R., Bontcheva, K., Tolmie, P.: Towards detecting rumours in social media. In: AAAI Workshop: AI for Cities (2015)


Author information

Corresponding authors

Correspondence to Michael Orlov or Marina Litvak.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Orlov, M., Litvak, M. (2019). Using Behavior and Text Analysis to Detect Propagandists and Misinformers on Twitter. In: Lossio-Ventura, J., Muñante, D., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2018. Communications in Computer and Information Science, vol 898. Springer, Cham. https://doi.org/10.1007/978-3-030-11680-4_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-11680-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11679-8

  • Online ISBN: 978-3-030-11680-4

  • eBook Packages: Computer Science (R0)
