Leveraging node neighborhoods and egograph topology for better bot detection in social graphs

Abstract

Because of their large user bases, online social networks are a frequent target for spam, scams, malware distribution and, more recently, state-actor propaganda. In this paper, we review several recent approaches to fake account and bot classification. Based on this review and our experiments, we propose our own method, which leverages the social graph's topology and the differences between the egographs of legitimate and fake user accounts to improve identification of the latter. We evaluate our approach against other common approaches on a real-world dataset of Twitter users.
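To make the notion of egograph features more concrete, the following is a minimal sketch of how a few topological features could be read off the radius-1 egograph of an account. It is an illustration under stated assumptions and not the feature set used in the paper: the undirected adjacency-dict representation, the function names `ego_graph` and `ego_features`, the chosen features (node count, edge count, density, mean neighbor degree) and the toy graph are all hypothetical.

```python
def ego_graph(adj, ego):
    """Return (nodes, edges) of the radius-1 egograph around `ego`.

    `adj` maps each node to the set of its neighbors (undirected graph,
    an assumption made for this sketch).
    """
    nodes = {ego} | adj.get(ego, set())
    # Keep only edges whose endpoints both lie inside the egograph.
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adj.get(u, set())
        if v in nodes and u != v
    }
    return nodes, edges


def ego_features(adj, ego):
    """Compute a few illustrative (hypothetical) egograph features."""
    nodes, edges = ego_graph(adj, ego)
    n = len(nodes)
    possible_edges = n * (n - 1) / 2
    neighbor_degrees = [len(adj.get(v, set())) for v in adj.get(ego, set())]
    return {
        "nodes": n,
        "edges": len(edges),
        # Fraction of possible edges that are present in the egograph.
        "density": len(edges) / possible_edges if possible_edges else 0.0,
        # Average degree of the ego's neighbors in the full graph.
        "mean_neighbor_degree": (
            sum(neighbor_degrees) / len(neighbor_degrees)
            if neighbor_degrees else 0.0
        ),
    }


# Toy graph: "alice" sits in a tightly knit group, while "acct_b" is
# connected only to otherwise isolated accounts.
toy_graph = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"alice", "carol"},
    "acct_b": {"x", "y", "z"},
    "x": {"acct_b"},
    "y": {"acct_b"},
    "z": {"acct_b"},
}

alice_features = ego_features(toy_graph, "alice")
acct_b_features = ego_features(toy_graph, "acct_b")
```

In this toy example the two egographs have the same number of nodes but different edge counts and densities. Feature vectors of this kind could then be passed to a standard classifier; which features actually separate legitimate from fake accounts is an empirical question that the sketch does not answer.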


Notes

  1. Twitter developer policy, accessed November 15, 2020: https://developer.twitter.com/en/developer-terms/agreement-and-policy.

Acknowledgements

This work was partly supported by the Institute for Information & Communications Technology Promotion (2015-0-00310-SW.StarLab, 2017-0-01772-VTT, 2018-0-00622-RMI, 2019-0-01367-BabyMind) and the Korea Institute for Advancement of Technology (P0006720-GENKO) grants funded by the Korean government.

Author information

Corresponding author

Correspondence to Björn Bebensee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Feature correlation heatmap

See Fig. 12.

Fig. 12 Correlation heatmap of all features scraped or generated (neighborhood features and egograph features)
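As a rough illustration of what underlies such a heatmap, the sketch below computes a pairwise Pearson correlation matrix over per-account feature columns. The feature names and values are invented for the example and are not taken from the paper's dataset. The choice of the Pearson coefficient is also an assumption, since this section does not state which correlation measure the figure uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # A constant column has no defined correlation; report 0.0 here.
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def correlation_matrix(features):
    """Map each pair of feature names to their Pearson correlation."""
    names = list(features)
    return {
        a: {b: pearson(features[a], features[b]) for b in names}
        for a in names
    }


# Hypothetical feature columns, one value per account.
toy_features = {
    "ego_edges": [1.0, 2.0, 3.0, 4.0],
    "ego_edges_doubled": [2.0, 4.0, 6.0, 8.0],
    "reversed_rank": [4.0, 3.0, 2.0, 1.0],
}

matrix = correlation_matrix(toy_features)
```

Each cell `matrix[a][b]` corresponds to one cell of a heatmap. Values close to 1 or -1 indicate strongly related features, which may be candidates for removal before training a classifier.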

About this article

Cite this article

Bebensee, B., Nazarov, N. & Zhang, BT. Leveraging node neighborhoods and egograph topology for better bot detection in social graphs. Soc. Netw. Anal. Min. 11, 10 (2021). https://doi.org/10.1007/s13278-020-00713-z

Keywords

  • Fake account detection
  • Social graph
  • Network topology
  • Social network analysis