Identifying Correlated Bots in Twitter

  • Nikan Chavoshi
  • Hossein Hamooni
  • Abdullah Mueen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10047)

Abstract

We develop a technique to identify abnormally correlated user accounts in Twitter, which are very unlikely to be human operated. This new approach to bot detection cross-correlates user activities and requires no labeled data, as opposed to existing bot detection techniques that consider users independently and require large amounts of recently labeled data. Our system uses a lag-sensitive hashing technique and a warping-invariant correlation measure to quickly organize the user accounts into clusters of abnormally correlated accounts. Our method is 94% precise and detects unique bots that other methods cannot detect. Our system produces daily reports on bots at a rate of several hundred bots per day. The reports are available online for further analysis.
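
To make the idea of cross-correlating user activities concrete, the following is a minimal sketch, not the authors' system. It assumes each account's activity has been binned into equal-length count series (for example, tweets per second) and searches a small range of shifts for the highest Pearson correlation. This plain lag search is a simpler stand-in for the lag-sensitive hashing and warping-invariant correlation measure named in the abstract. The function names, the `max_lag` parameter, and the sample data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag_correlation(x, y, max_lag):
    """Return (lag, r) with the highest Pearson correlation over integer shifts.

    A positive lag means x trails y by that many bins. Both series must
    have the same length. This is an illustrative lag search only.
    """
    best = (0, -1.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[lag:], y[:len(y) - lag]
        else:
            xs, ys = x[:len(x) + lag], y[-lag:]
        if len(xs) < 2:
            continue  # too little overlap to correlate
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical activity counts: account_b repeats account_a two bins later.
account_a = [0, 1, 0, 0, 3, 0, 0, 2, 0, 0, 1, 0]
account_b = [0, 0, 0, 1, 0, 0, 3, 0, 0, 2, 0, 0]

lag, r = best_lag_correlation(account_b, account_a, max_lag=3)
print(lag, r)
```

Two human-operated accounts would rarely show near-perfect correlation at a fixed lag over a long window, which is the intuition behind flagging such pairs. A real pipeline would also need a way to avoid comparing every pair of accounts, which is the role the abstract assigns to hashing.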

Keywords

Social Medium; Human User; Social Media Data; Spam Detection; Automate Account
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Nikan Chavoshi (1)
  • Hossein Hamooni (1)
  • Abdullah Mueen (1)

  1. University of New Mexico, Albuquerque, USA
