Detection of Fake Profiles on Online Social Network Platforms: Performance Evaluation of Artificial Intelligence Techniques

  • Original Research
  • SN Computer Science

Abstract

The emergence of online social network (OSN) platforms has resulted in the production of enormous amounts of data from billions of active users. The ease of access to personal information on OSN platforms leaves users vulnerable to the creation of fake profiles. Fake accounts are primarily used to disseminate unsolicited messages, unverified information, and other deceitful content on OSN platforms. No prior study has examined whether traditional machine learning or deep learning techniques are more effective at detecting fake profiles on OSN platforms as a function of dataset size. The present study fills this gap by evaluating the performance of artificial intelligence techniques (traditional machine learning and deep learning) on benchmark datasets of different sizes. An ablation study is conducted to determine the optimal combination of features, and data augmentation techniques are employed to address dataset imbalance. The results show that data augmentation, particularly the synthetic minority over-sampling technique (SMOTE), yields better results on an imbalanced dataset. Furthermore, deep learning techniques (LSTM, with an accuracy of 97%) perform better on large datasets for detecting fake profiles on OSN platforms, while traditional machine learning techniques (XGBoost, also with an accuracy of 97%) perform better on small datasets. The top-performing techniques are also compared with state-of-the-art techniques to validate the results. The study may aid future researchers in developing a comprehensive methodology for detecting fake profiles on OSN platforms.
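
To make the over-sampling idea named in the abstract concrete, the following is a minimal sketch of SMOTE in plain Python. It is not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, and the toy feature values are assumptions chosen for illustration only. The sketch creates each synthetic minority sample by interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two features per profile (not taken from the paper's datasets).
genuine = [[120.0, 0.9], [300.0, 0.8], [85.0, 0.7],
           [410.0, 0.95], [150.0, 0.85], [230.0, 0.75]]
fake = [[5.0, 0.1], [12.0, 0.2]]

# Oversample the minority (fake) class until both classes have the same size.
new_fake = smote(fake, n_synthetic=len(genuine) - len(fake), k=5, seed=42)
balanced_fake = fake + new_fake
print(len(genuine), len(balanced_fake))  # prints "6 6"
```

In practice a maintained library implementation would normally be preferred over a hand-written one, and over-sampling would be applied to the training split only so that synthetic points do not leak into the evaluation data.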

Data Availability

The datasets used in this research are cited in the reference list.

Notes

  1. https://datareportal.com/social-media-users and Global Social Media User Statistics (2023) — Demography & Facts (demandsage.com) (Accessed on 10th January 2024).

  2. https://backlinko.com/social-media-users and https://datareportal.com/social-media-users (Accessed on 10th January 2024).

  3. Global Social Media User Statistics (2023) — Demography & Facts (demandsage.com) (Accessed on 10th January 2024).

  4. https://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/ (Accessed on 10th January 2024).

  5. https://www.bankmycell.com/blog/how-many-phones-are-in-the-world (Accessed on 10th January 2024).

References

  1. Aïmeur E, Amri S, Brassard G. Fake news, disinformation and misinformation in social media: a review. Soc Netw Anal Min. 2023. https://doi.org/10.1007/s13278-023-01028-5.

  2. Priyadharshini VM, Valarmathi A. A novel spam detection technique for detecting and classifying malicious profiles in online social networks. J Intell Fuzzy Syst. 2021;41(1):993–1007. https://doi.org/10.3233/JIFS-202937.

  3. Gupta S, Verma B, Gupta P, Goel L, Yadav AK, Yadav D. Identification of fake news using deep neural network-based hybrid model. SN Comput Sci. 2023;4:679. https://doi.org/10.1007/s42979-023-02117-0.

  4. Clark EM, Williams JR, Jones CA, Galbraith RA, Danforth CM, Dodds PS. Sifting robotic from organic text: a natural language approach for detecting automation on Twitter. J Comput Sci. 2016;16:1–7. https://doi.org/10.1016/J.JOCS.2015.11.002.

  5. Mughaid A, Obeidat I, AlZu’bi S, Elsoud EA, Alnajjar A, Alsoud AR, Abualigah L. A novel machine learning and face recognition technique for fake accounts detection system on cyber social networks. Multimed Tools Appl. 2023;82:26353–78. https://doi.org/10.1007/s11042-023-14347-8.

  6. Van Der Walt E, Eloff J. Using machine learning to detect fake identities: bots vs humans. IEEE Access. 2018;6:6540–9. https://doi.org/10.1109/ACCESS.2018.2796018.

  7. Swe MM, Nyein Myo N. Fake Accounts Detection on Twitter Using Blacklist. In: Proceedings – 17th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2018, pp. 562–566, 2018. https://doi.org/10.1109/ICIS.2018.8466499.

  8. Shahid W, Li Y, Staples D, Amin G, Hakak S, Ghorbani A. Are you a cyborg, bot or human? A survey on detecting fake news spreaders. IEEE Access. 2022;10:27069–83. https://doi.org/10.1109/ACCESS.2022.3157724.

  9. Islam MM, Uddin MA, Islam L, Akter A, Sharmin S, Acharjee UK. Cyberbullying Detection on Social Networks Using Machine Learning Approaches. In: 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2020, 2020. https://doi.org/10.1109/CSDE50874.2020.9411601.

  10. Shah A, Varshney S, Mehrotra M. DeepMUI: a novel method to identify malicious users on online social network platforms. Concurr Comput. 2023. https://doi.org/10.1002/CPE.7917.

  11. Sahoo SR, Gupta BB. Classification of various attacks and their defence mechanism in online social networks: a survey. Enterp Inf Syst. 2019;13(6):832–64. https://doi.org/10.1080/17517575.2019.1605542.

  12. Singh N, Sharma T, Thakral A, Choudhury T. Detection of Fake Profile in Online Social Networks Using Machine Learning. In: Proceedings on 2018 International Conference on Advances in Computing and Communication Engineering, ICACCE 2018; 2018. pp. 231–234. https://doi.org/10.1109/ICACCE.2018.8441713.

  13. Ambareen K, Meenakshi Sundaram S. A survey of cyberbullying detection and performance: its impact in social media using artificial intelligence. SN Comput Sci. 2023;4:859. https://doi.org/10.1007/s42979-023-02301-2.

  14. Thaokar C, Rout JK, Rout M, Ray NK. N-Gram based sarcasm detection for news and social media text using hybrid deep learning models. SN Comput Sci. 2024;5:163. https://doi.org/10.1007/s42979-023-02506-5.

  15. Ramalingam D, Chinnaiah V. Fake profile detection techniques in large-scale online social networks: a comprehensive review. Comput Electr Eng. 2018;65:165–77. https://doi.org/10.1016/J.COMPELECENG.2017.05.020.

  16. Wanda P, Jie HJ. DeepProfile: finding fake profile in online social network using dynamic CNN. J Inform Secur Appl. 2020;52:102465. https://doi.org/10.1016/J.JISA.2020.102465.

  17. Pv S, Bhanu SMS. UbCadet: detection of compromised accounts in twitter based on user behavioural profiling. Multimed Tools Appl. 2020;79:27–8. https://doi.org/10.1007/S11042-020-08721-Z.

  18. Bharti KK, Pandey S. Fake account detection in twitter using logistic regression with particle swarm optimization. Soft Comput. 2021;25(16):11333–45. https://doi.org/10.1007/S00500-021-05930-Y.

  19. Roy PK, Chahar S. Fake profile detection on social networking websites: a comprehensive review. IEEE Trans Artif Intell. 2020;1(3):271–85. https://doi.org/10.1109/TAI.2021.3064901.

  20. Kaushik K, Bhardwaj A, Kumar M, Gupta SK, Gupta A. A novel machine learning-based framework for detecting fake Instagram profiles. Concurr Comput. 2022;34(28):e7349. https://doi.org/10.1002/CPE.7349.

  21. Jia J, Wang B, Gong NZ. Random walk based fake account detection in online social networks. In: Proceedings – 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2017; 2017. pp 273–284. https://doi.org/10.1109/DSN.2017.55.

  22. Fire M, Goldschmidt R, Elovici Y. Online social networks: threats and solutions. IEEE Commun Surv Tutorials. 2014;16(4):2019–36. https://doi.org/10.1109/COMST.2014.2321628.

  23. Alsaleh M, Alarifi A, Al-Salman AM, Alfayez M, Almuhaysin A. TSD: Detecting sybil accounts in twitter. In: Proceedings – 2014 13th International Conference on Machine Learning and Applications, ICMLA 2014; 2014. pp. 463–469. https://doi.org/10.1109/ICMLA.2014.81.

  24. Erşahin B, Aktaş Ö, Kılınç D, Akyol C. Twitter fake account detection. In: 2nd International Conference on Computer Science and Engineering, UBMK 2017; 2017. pp. 388–392. https://doi.org/10.1109/UBMK.2017.8093420.

  25. Alom Z, Carminati B, Ferrari E. Detecting spam accounts on Twitter. In: Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018; 2018. pp. 1191–1198. https://doi.org/10.1109/ASONAM.2018.8508495.

  26. Senthil Raja M, Arun Raj L. Detection of malicious profiles and protecting users in online social networks. Wirel Pers Commun. 2022;127(1):107–24. https://doi.org/10.1007/S11277-021-08095-X.

  27. Awan MJ, Khan MA, Ansari ZK, Yasin A, Shehzad HMF. Fake profile recognition using big data analytics in social media platforms. Int J Comput Appl Technol. 2022;68(3):215–22. https://doi.org/10.1504/IJCAT.2022.124942.

  28. David I, Siordia OS, Moctezuma D. Features combination for the detection of malicious Twitter accounts. In: 2016 IEEE International Autumn Meeting on Power, Electronics and Computing, ROPEC 2016; 2017. https://doi.org/10.1109/ROPEC.2016.7830626.

  29. Revathi S, Suriakala M. Profile similarity communication matching approaches for detection of duplicate profiles in online social network. In: Proceedings – 2018 3rd International Conference on Computational Systems and Information Technology for Sustainable Solutions, CSITSS 2018; 2018. pp. 174–182. https://doi.org/10.1109/CSITSS.2018.8768751.

  30. Sheikhi S. An efficient method for detection of fake accounts on the Instagram platform. Rev Intell Artif. 2020;34(4):429–36. https://doi.org/10.18280/RIA.340407.

  31. Singh M, Bansal D, Sofat S. Detecting malicious users in Twitter using classifiers. In: ACM International Conference Proceeding Series, vol. 2014. 2014. pp. 247–253. https://doi.org/10.1145/2659651.2659736.

  32. Akyon FC, Esat Kalfaoglu M. Instagram fake and automated account detection. In: Proceedings – 2019 Innovations in Intelligent Systems and Applications Conference, ASYU 2019; 2019. https://doi.org/10.1109/ASYU48272.2019.8946437.

  33. Quinlan JR. Induction of decision trees. Mach Learn. 1986;1(1):81–106. https://doi.org/10.1007/BF00116251.

  34. Schölkopf B. SVMs: a practical consequence of learning theory. IEEE Intell Syst Appl. 1998;13(4):18–21. https://doi.org/10.1109/5254.708428.

  35. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. https://doi.org/10.1023/A:1010933404324.

  36. Lewis DD. Naive (Bayes) at forty: the independence assumption in information retrieval. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1398. 1998. pp. 4–15. https://doi.org/10.1007/BFB0026666.

  37. Maalouf M. Logistic regression in data analysis: an overview. Int J Data Anal Tech Strateg. 2011;3(3):281–99. https://doi.org/10.1504/IJDATS.2011.041335.

  38. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13. 2016. pp. 785–794. https://doi.org/10.1145/2939672.2939785.

  39. Pineda FJ. Generalization of back-propagation to recurrent neural networks. Phys Rev Lett. 1987;59(19):2229. https://doi.org/10.1103/PhysRevLett.59.2229.

  40. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. https://doi.org/10.1162/NECO.1997.9.8.1735.

  41. LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–323. https://doi.org/10.1109/5.726791.

  42. Cresci S, Spognardi A, Petrocchi M, Tesconi M, Di Pietro R. The paradigm-shift of social spambots: evidence, theories, and tools for the arms race. In: 26th International World Wide Web Conference 2017, WWW 2017 Companion; 2017. pp. 963–972. https://doi.org/10.1145/3041021.3055135.

  43. Cresci S, Di Pietro R, Petrocchi M, Spognardi A, Tesconi M. Fame for sale: efficient detection of fake Twitter followers. Decis Support Syst. 2015;80:56–71. https://doi.org/10.1016/J.DSS.2015.09.003.

  44. Ghanem R, Erbay H. Spam detection on social networks using deep contextualized word representation. Multimed Tools Appl. 2023;82(3):3697–712. https://doi.org/10.1007/S11042-022-13397-8.

  45. Ghanem R, Erbay H, Bakour K. Contents-based spam detection on social networks using RoBERTa embedding and stacked BLSTM. SN Comput Sci. 2023. https://doi.org/10.1007/s42979-023-01798-x.

Funding

This research received no specific funding from any public, private, or nonprofit source. The figures and tables used in this publication do not require copyright permission.

Author information

Corresponding author

Correspondence to Akash Shah.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest pertaining to this research.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shah, A., Varshney, S. & Mehrotra, M. Detection of Fake Profiles on Online Social Network Platforms: Performance Evaluation of Artificial Intelligence Techniques. SN COMPUT. SCI. 5, 489 (2024). https://doi.org/10.1007/s42979-024-02839-9

  • DOI: https://doi.org/10.1007/s42979-024-02839-9
