Bot Detection: Will Focusing on Recall Cause Overall Performance Deterioration?
Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts from human users. Several bot detection methods approach this problem. Conventional methods either focus on precision regardless of the overall performance or optimize overall performance, say \(F_1\), without monitoring the effect on precision or recall. Focusing on precision means that the users marked as bots are indeed likely to be bots, but a large portion of the bots could remain undetected. From a user's perspective, however, it is more desirable to have less interaction with bots, even at the cost of some precision. This can be achieved by a detection method with higher recall. A trivial, but useless, solution for high recall is to classify every account (human or bot) as a bot, which yields perfect recall but poor overall performance.
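The trade-off above can be made concrete with a minimal sketch (the labels below are hypothetical, chosen only for illustration): the trivial "mark everything as a bot" classifier attains perfect recall, yet its precision, and therefore its \(F_1\), collapses.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = bot, 0 = human)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical ground truth: 2 bots among 10 accounts.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
trivial = [1] * len(y_true)  # classify every account as a bot
p, r, f1 = precision_recall_f1(y_true, trivial)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
# → precision=0.20 recall=1.00 F1=0.33
```

The question the paper poses is whether recall can approach 1.0 without precision (and hence \(F_1\)) degrading this severely.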
In this work, we investigate whether it is feasible for a method to focus on recall without considerable loss in overall performance. Extensive experiments on the recall–precision trade-off suggest that high recall can be achieved without much deterioration in overall performance. This research leads to REFOCUS, a recall-focused approach to bot detection, along with lessons learned and future directions.
Keywords: Social media · Twitter · Social bots · Bot detection · Recall
Support was provided, in part, by NSF grant 1461886 on "Disaster Preparation and Response via Big Data Analysis and Robust Networking" and ONR grants N000141612257 (on "Intelligent Analysis of Big Social Media Data for Crisis Tracking") and N000141812108 (on "Bot Hunter"). We would like to thank the anonymous reviewers for their valuable feedback.