BotFinder: a novel framework for social bots detection in online social networks based on graph embedding and community detection

Li, Shudong; Zhao, Chuanyu; Li, Qing; Huang, Jiuming; Zhao, Dawei; Zhu, Peican

doi:10.1007/s11280-022-01114-2

Published: 11 November 2022

World Wide Web, volume 26, pages 1793–1809 (2023)

Shudong Li^1,2, Chuanyu Zhao^1, Qing Li^3, Jiuming Huang^4, Dawei Zhao^5 & Peican Zhu^6

Abstract

With the widespread popularity of online social networks (OSNs), the number of users has increased exponentially in recent years. At the same time, social bots, i.e. accounts controlled by programs, are also on the rise. Service providers of OSNs often use them to keep social networks active, but some social bots are registered for malicious purposes. Detecting these malicious social bots is necessary to present a genuine public opinion environment. We propose BotFinder, a framework to detect malicious social bots in OSNs. Specifically, it combines machine learning and graph methods so that the potential features of social bots can be extracted effectively. For feature engineering, we generate second-order features and use encoding methods for variables with high cardinality. These features make full use of both labelled and unlabelled samples. For the graphs, we first generate node vectors with an embedding method and then calculate the similarity between the vectors of humans and bots; we then use an unsupervised method to diffuse labels, which further improves performance. To validate the proposed method, we conduct extensive experiments on a dataset provided by an artificial intelligence contest, which comprises over eight million user records. Results show that our approach reaches an F1-score of 0.8850, which is much better than the state of the art.
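
The abstract names four building blocks without giving details: encoding of high-cardinality variables, similarity between node vectors, label diffusion over a graph, and evaluation by F1-score. The following is a minimal Python sketch of generic versions of each, offered only as an illustration. It is not the authors' implementation: frequency encoding, cosine similarity and majority-vote label propagation are assumed stand-ins, and every function name, parameter and toy value below is hypothetical.

```python
from collections import Counter
from math import sqrt


def frequency_encode(values):
    """Replace each category by its relative frequency.

    Assumption: one common encoding for high-cardinality variables;
    the paper's exact encoding is not stated in the abstract.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def cosine(u, v):
    """Cosine similarity between two node vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def diffuse_labels(edges, seed_labels, rounds=10):
    """Spread known labels over an undirected graph by neighbour majority vote.

    Seed labels stay fixed; unlabelled nodes adopt the most common label
    among their already-labelled neighbours. Updates are synchronous.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    labels = dict(seed_labels)
    for _ in range(rounds):
        updates = {}
        for node in sorted(neighbours):
            if node in seed_labels:
                continue
            votes = Counter(labels[n] for n in neighbours[node] if n in labels)
            if votes:
                # Ties are broken deterministically by label name.
                updates[node] = max(sorted(votes), key=lambda lab: votes[lab])
        if all(labels.get(k) == v for k, v in updates.items()):
            break  # nothing changed in this round
        labels.update(updates)
    return labels


def f1_score(y_true, y_pred, positive="bot"):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Toy usage: two disconnected chains, one seeded as bot and one as human.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f")]
toy_labels = diffuse_labels(toy_edges, {"a": "bot", "d": "human"})
```

In this toy graph every node in the chain seeded with a bot ends up labelled "bot" and every node in the other chain "human". A real pipeline would feed the encoded features and similarity scores into a supervised classifier before or alongside the diffusion step; how BotFinder combines them is described in the full article, not in the abstract.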

Data availability

Not applicable.

Acknowledgements

This research was funded by NSFC (Grant Nos. U1803263, 62072131, 62073263), Science and Technology Projects in Guangzhou (Nos. 202206030001, 202102010442), Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011401), the Major Key Project of PCL (Grant Nos. PCL2021A09, PCL2021A02, PCL2022A03), Guangdong Higher Education Innovation Group (Grant No. 2020KCXTD007) and Guangzhou Higher Education Innovation Group (Grant No. 202032854).

Funding

This research was funded by NSFC (Grant Nos. U1803263, 62072131, 62073263), Science and Technology Projects in Guangzhou (Nos. 202206030001, 202102010442), Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011401), the Major Key Project of PCL (Grant Nos. PCL2021A09, PCL2021A02, PCL2022A03), Guangdong Higher Education Innovation Group (Grant No. 2020KCXTD007) and Guangzhou Higher Education Innovation Group (Grant No. 202032854).

Author information

Shudong Li and Chuanyu Zhao contributed equally to this work.

Authors and Affiliations

Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China
Shudong Li & Chuanyu Zhao
Peng Cheng Laboratory, Shenzhen, 518000, China
Shudong Li
Shandong Jianzhu University, Jinan, 250101, China
Qing Li
Hunan Singhand Intelligent Data Technology Co., Ltd, Changsha, China
Jiuming Huang
Shandong Provincial Key Laboratory of Computer Networks, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, China
Dawei Zhao
School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi’an, 710072, China
Peican Zhu

Contributions

Conception and design of study: Shudong Li and Chuanyu Zhao

Data processing: Qing Li and Jiuming Huang

Analysis of experimental results: Dawei Zhao

Manuscript revision: Peican Zhu

Corresponding authors

Correspondence to Shudong Li or Qing Li.

Ethics declarations

Ethical approval and consent to participate

Not applicable.

Human and animal ethics

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, S., Zhao, C., Li, Q. et al. BotFinder: a novel framework for social bots detection in online social networks based on graph embedding and community detection. World Wide Web 26, 1793–1809 (2023). https://doi.org/10.1007/s11280-022-01114-2

Received: 19 July 2022
Revised: 26 September 2022
Accepted: 02 October 2022
Published: 11 November 2022
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11280-022-01114-2
