Bot Detection: Will Focusing on Recall Cause Overall Performance Deterioration?

  • Tahora H. Nazer
  • Matthew Davis
  • Mansooreh Karami
  • Leman Akoglu
  • David Koelle
  • Huan Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11549)


Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts from human users. Several bot detection methods approach this problem. Conventional methods either focus on precision regardless of overall performance or optimize overall performance, say \(F_1\), without monitoring the effect on precision or recall. Focusing on precision means that accounts flagged as bots are indeed likely to be bots, but a large portion of the bots may remain undetected. From a user’s perspective, however, it is more desirable to interact with fewer bots, even at some cost in precision. This can be achieved by a detection method with higher recall. A trivial, but useless, solution for high recall is to classify every account (human or bot) as a bot, which results in poor overall performance.
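The trade-off above can be made concrete with a small numerical sketch. The snippet below (a hypothetical example, not from the paper; the labels are made up for illustration) computes precision, recall, and \(F_1\) for the trivial "label everything a bot" classifier, showing that perfect recall alone does not imply good overall performance.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 from binary labels (1 = bot)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical ground truth: 2 bots among 10 accounts.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Trivial classifier: flag every account as a bot.
all_bots = [1] * len(y_true)
p, r, f = precision_recall_f1(y_true, all_bots)
print(p, r, f)  # 0.2 1.0 0.333... — perfect recall, poor precision and F1
```

With only 20% of accounts actually being bots, the all-bot classifier achieves recall 1.0 but precision 0.2 and \(F_1 \approx 0.33\), which is why a recall-focused method must still monitor overall performance.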

In this work, we investigate whether it is feasible for a method to focus on recall without considerable loss in overall performance. Extensive experiments on the trade-off between recall and precision suggest that high recall can be achieved without much deterioration in overall performance. This research leads to REFOCUS, a recall-focused approach to bot detection, along with lessons learned and future directions.


Keywords: Social media · Twitter · Social bots · Bot detection · Recall



Support was provided, in part, by NSF grant 1461886 on “Disaster Preparation and Response via Big Data Analysis and Robust Networking” and ONR grants N000141612257 (on “Intelligent Analysis of Big Social Media Data for Crisis Tracking”) and N000141812108 (on “Bot Hunter”). We would like to thank the anonymous reviewers for their valuable feedback.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Arizona State University, Tempe, USA
  2. Carnegie Mellon University, Pittsburgh, USA
  3. Charles River Analytics, Cambridge, USA
