Hunting Malicious Bots on Twitter: An Unsupervised Approach

  • Zhouhan Chen
  • Rima S. Tanash
  • Richard Stoll
  • Devika Subramanian
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10540)

Abstract

Malicious bots violate Twitter’s terms of service. They include bots that post spam content, adware, and malware, as well as bots designed to sway public opinion. How prevalent are such bots on Twitter? Estimates vary: Twitter [3] itself states that less than 5% of its more than 300 million active accounts are bots, while the authors of [12], using a supervised machine learning approach with a manually curated set of Twitter bots, estimate that between 9% and 15% of active Twitter accounts are bots (both benign and malicious). In this paper, we propose an unsupervised approach to hunt for malicious bot groups on Twitter. The key structural and behavioral markers for such bot groups are the use of URL shortening services, duplicate tweets, and content coordination over extended periods of time. While these markers have been identified in prior work [9, 15], we devise a new protocol to automatically harvest such bot groups from live Tweet streams. Our experiments with this protocol show that between 4% and 23% (mean 10.5%) of all accounts that use shortened URLs are bots and bot networks that evade detection over a long period of time, with significant heterogeneity in distribution depending on the URL shortening service. We compare our detection approach with two state-of-the-art methods for bot detection on Twitter: a supervised learning approach called BotOrNot [10] and an unsupervised technique called DeBot [8]. We show that BotOrNot misclassifies around 40% of the malicious bots identified by our protocol. The overlap between the bots detected by our approach and those detected by DeBot, which uses synchronicity of tweeting as its primary behavioral marker, is around 7%, indicating that the two approaches target very different types of bots. Our protocol effectively identifies malicious bots in real time and in an entirely unsupervised manner, within a framework that is independent of language, topic, and keyword, and is a useful supplement to existing bot detection tools.
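To make the markers named in the abstract concrete, the following is a minimal Python sketch that groups accounts posting duplicate tweets containing shortened URLs. It is an illustration only and not the authors' protocol: the shortener domain list, the function names (`shortened_urls`, `normalize`, `candidate_bot_groups`), the `min_accounts` threshold, and the text normalization are all assumptions introduced here, and the sketch omits the long-term content coordination marker entirely.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumed, illustrative list of URL shortener domains (not taken from the paper).
SHORTENER_DOMAINS = {"bit.ly", "goo.gl", "ow.ly", "tinyurl.com"}

URL_PATTERN = re.compile(r"https?://\S+")


def shortened_urls(text):
    """Return the URLs in `text` whose host is one of the assumed shortener domains."""
    urls = URL_PATTERN.findall(text)
    return [u for u in urls if urlparse(u).netloc.lower() in SHORTENER_DOMAINS]


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return " ".join(text.lower().split())


def candidate_bot_groups(tweets, min_accounts=2):
    """Group accounts that posted the same normalized text containing a shortened URL.

    `tweets` is an iterable of (account, text) pairs. Returns a list of account
    sets, one per duplicated tweet text shared by at least `min_accounts` accounts.
    The threshold is a hypothetical parameter chosen for this sketch.
    """
    accounts_by_text = defaultdict(set)
    for account, text in tweets:
        if shortened_urls(text):
            accounts_by_text[normalize(text)].add(account)
    return [accts for accts in accounts_by_text.values() if len(accts) >= min_accounts]
```

Called on a small batch such as `[("a", "Win now https://bit.ly/x1"), ("b", "win  now https://bit.ly/x1")]`, the function would place accounts `a` and `b` in one candidate group. A real pipeline would also need to track such groups over time before treating them as bots.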

Keywords

Bot detection, Social network analysis, Data mining

References

  1. Earthquakebot. https://twitter.com/earthquakebot?lang=en. Accessed 30 Mar 2017
  2. Fighting spam with botmaker. https://blog.twitter.com/2014/fighting-spam-with-botmaker. Accessed 20 Mar 2017
  3.
  4. Twitter developer documentation. https://dev.twitter.com/streaming/overview/request-parameters. Accessed 20 Mar 2017
  5. The Twitter Rules. https://support.twitter.com/articles/18311. Accessed 13 Jan 2017
  6. We need you to help to get linkis back to work. http://blog.linkis.com/2017/06/02/we-need-you-to-help-to-get-linkis-back-to-work. Accessed 19 July 2017
  7. Cao, C., Caverlee, J.: Detecting spam URLs in social media via behavioral analysis. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 703–714. Springer, Cham (2015). doi:10.1007/978-3-319-16354-3_77
  8. Chavoshi, N., Hamooni, H., Mueen, A.: DeBot: Twitter bot detection via warped correlation. In: Proceedings of the 16th IEEE International Conference on Data Mining (2016)
  9. Chu, Z., Gianvecchio, S., Wang, H., Jajodia, S.: Detecting automation of Twitter accounts: are you a human, bot, or cyborg? IEEE Trans. Dependable Secure Comput. 9(6), 811–824 (2012)
  10. Davis, C.A., Varol, O., Ferrara, E., Flammini, A., Menczer, F.: BotOrNot: a system to evaluate social bots. In: Companion to Proceedings of the 25th International Conference on the World Wide Web, pp. 273–274. International World Wide Web Conferences Steering Committee (2016)
  11. Ferrara, E., Varol, O., Davis, C., Menczer, F., Flammini, A.: The rise of social bots. Commun. ACM 59(7), 96–104 (2016)
  12. Varol, O., Ferrara, E., Davis, C.A., Menczer, F., Flammini, A.: Online human-bot interactions: detection, estimation, and characterization. arXiv preprint (2017). arXiv:1703.03107
  13. Jiang, M., Cui, P., Faloutsos, C.: Suspicious behavior detection: current trends and future directions. IEEE Intell. Syst. 31(1), 31–39 (2016)
  14. Montesinos, L., Rodríguez, S.J.P., Orchard, M., Eyheramendy, S.: Sentiment analysis and prediction of events in Twitter. In: 2015 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), pp. 903–910, October 2015
  15. Wang, D., Navathe, S., Liu, L., Irani, D., Tamersoy, A., Pu, C.: Click traffic analysis of short URL spam on Twitter. In: 2013 9th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), pp. 250–259. IEEE (2013)
  16. Twitter Bot Monitor project on github. https://github.com/Joe--Chen/TwitterBotProject

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Zhouhan Chen (1)
  • Rima S. Tanash (1)
  • Richard Stoll (1)
  • Devika Subramanian (1)
  1. Rice University, Houston, USA
