Hunting Malicious Bots on Twitter: An Unsupervised Approach
Abstract
Malicious bots violate Twitter’s terms of service; they include bots that post spam content, adware and malware, as well as bots that are designed to sway public opinion. How prevalent are such bots on Twitter? Estimates vary, with Twitter [3] itself stating that less than 5% of its over 300 million active accounts are bots. Using a supervised machine learning approach with a manually curated set of Twitter bots, [12] estimate that between 9% and 15% of active Twitter accounts are bots (both benign and malicious). In this paper, we propose an unsupervised approach to hunt for malicious bot groups on Twitter. Key structural and behavioral markers for such bot groups are the use of URL shortening services, duplicate tweets and content coordination over extended periods of time. While these markers have been identified in prior work [9, 15], we devise a new protocol to automatically harvest such bot groups from live tweet streams. Our experiments with this protocol show that between 4% and 23% (mean 10.5%) of all accounts that use shortened URLs are bots and bot networks that evade detection over a long period of time, with significant heterogeneity in distribution based on the URL shortening service. We compare our detection approach with two state-of-the-art methods for bot detection on Twitter: a supervised learning approach called BotOrNot [10] and an unsupervised technique called DeBot [8]. We show that BotOrNot misclassifies around 40% of the malicious bots identified by our protocol. The overlap between bots detected by our approach and those detected by DeBot, which uses synchronicity of tweeting as a primary behavioral marker, is around 7%, indicating that the two approaches target very different types of bots. Our protocol identifies malicious bots in real time and in an entirely unsupervised manner, independent of language, topic and keyword, and is a useful supplement to existing bot detection tools.
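To make two of the markers named above concrete (shortened URLs and duplicate tweets), the following is a minimal sketch and not the protocol evaluated in the paper. The shortener list, the `(account, text, url)` tuple schema, and the function names are all assumptions introduced for illustration, and the third marker, coordination over extended periods of time, is not modeled here.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical shortener list for illustration; not taken from the paper.
SHORTENER_DOMAINS = {"bit.ly", "goo.gl", "ow.ly", "tinyurl.com"}


def shortener_of(url):
    """Return the shortener domain of url, or None if it is not a known shortener."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host if host in SHORTENER_DOMAINS else None


def candidate_bot_groups(tweets, min_accounts=2):
    """Group accounts that posted the same text together with a shortened URL.

    tweets: iterable of (account, text, url) tuples (an assumed schema).
    Returns a dict mapping normalized tweet text to the set of accounts that
    posted it, keeping only texts shared by at least min_accounts accounts.
    """
    by_text = defaultdict(set)
    for account, text, url in tweets:
        if shortener_of(url) is None:
            continue  # only tweets carrying a shortened URL are considered
        key = " ".join(text.lower().split())  # normalize case and whitespace
        by_text[key].add(account)
    return {t: accts for t, accts in by_text.items() if len(accts) >= min_accounts}
```

Groups returned by such a filter would only be candidates; deciding whether a group is a bot network requires the longer-term coordination evidence that the abstract describes.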
Keywords
Bot detection, Social network analysis, Data mining
References
- 1. Earthquakebot. https://twitter.com/earthquakebot?lang=en. Accessed 30 Mar 2017
- 2. Fighting spam with botmaker. https://blog.twitter.com/2014/fighting-spam-with-botmaker. Accessed 20 Mar 2017
- 3. Twitter Annual Report. http://files.shareholder.com/downloads/AMDA-2F526X/4335316487x0xS1564590-17-2584/1418091/filing.pdf. Accessed 22 Apr 2017
- 4. Twitter developer documentation. https://dev.twitter.com/streaming/overview/request-parameters. Accessed 20 Mar 2017
- 5. The Twitter Rules. https://support.twitter.com/articles/18311. Accessed 13 Jan 2017
- 6. We need you to help to get linkis back to work. http://blog.linkis.com/2017/06/02/we-need-you-to-help-to-get-linkis-back-to-work. Accessed 19 July 2017
- 7. Cao, C., Caverlee, J.: Detecting spam URLs in social media via behavioral analysis. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 703–714. Springer, Cham (2015). doi: 10.1007/978-3-319-16354-3_77
- 8. Chavoshi, N., Hamooni, H., Mueen, A.: DeBot: Twitter bot detection via warped correlation. In: Proceedings of the 16th IEEE International Conference on Data Mining (2016)
- 9. Chu, Z., Gianvecchio, S., Wang, H., Jajodia, S.: Detecting automation of Twitter accounts: are you a human, bot, or cyborg? IEEE Trans. Dependable Secure Comput. 9(6), 811–824 (2012)
- 10. Davis, C.A., Varol, O., Ferrara, E., Flammini, A., Menczer, F.: BotOrNot: a system to evaluate social bots. In: Companion to Proceedings of the 25th International Conference on the World Wide Web, pp. 273–274. International World Wide Web Conferences Steering Committee (2016)
- 11. Ferrara, E., Varol, O., Davis, C., Menczer, F., Flammini, A.: The rise of social bots. Commun. ACM 59(7), 96–104 (2016)
- 12. Varol, O., Ferrara, E., Davis, C.A., Menczer, F., Flammini, A.: Online human-bot interactions: detection, estimation, and characterization. arXiv preprint (2017). arXiv:1703.03107
- 13. Jiang, M., Cui, P., Faloutsos, C.: Suspicious behavior detection: current trends and future directions. IEEE Intell. Syst. 31(1), 31–39 (2016)
- 14. Montesinos, L., Rodríguez, S.J.P., Orchard, M., Eyheramendy, S.: Sentiment analysis and prediction of events in Twitter. In: 2015 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), pp. 903–910, October 2015
- 15. Wang, D., Navathe, S., Liu, L., Irani, D., Tamersoy, A., Pu, C.: Click traffic analysis of short URL spam on Twitter. In: 2013 9th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), pp. 250–259. IEEE (2013)
- 16. Twitter Bot Monitor project on GitHub. https://github.com/Joe--Chen/TwitterBotProject