SNEFL: Social network explicit fuzzy like dataset and its application for Incel detection


In this paper, after reviewing and comparing existing social network datasets, we introduce the SNEFL dataset: the first social network dataset that includes the level of users’ likes (fuzzy like) in addition to the likes between users. The data was collected from a social network with users’ privacy in mind. It includes several additional features: age, gender, marital status, height, weight, educational level and religiosity of the users. We describe its structure, analyse its features and evaluate its advantages in comparison with other social network datasets. In addition, using the unique feature of the SNEFL dataset (fuzzy like), a rule-based algorithm has been developed for the first time to detect involuntary celibates (Incels) in social networks. Despite Incels’ activity in online social networks, no computer science study has so far attempted to identify them. This study is a first step towards addressing this challenge facing society today. Experimental results show that the accuracy of the proposed algorithm in identifying Incels is 23.21% among all social network users and 68.75% among users who have fuzzy like data. Beyond Incel detection, the SNEFL dataset can be used by researchers in different fields to produce more accurate results. Study areas in which it can be used include network analysis, frequent pattern mining, classification and clustering.








The authors would like to express their deepest gratitude to Dr. Anahita Hajarian for her contribution to this article.

Author information



Corresponding authors

Correspondence to Mohammad Hajarian or Azam Bastanfard.




Table 3 Comparing Social network datasets without like data


About this article


Cite this article

Hajarian, M., Bastanfard, A., Mohammadzadeh, J. et al. SNEFL: Social network explicit fuzzy like dataset and its application for Incel detection. Multimed Tools Appl 78, 33457–33486 (2019).



  • Dataset
  • Social network
  • Fuzzy like
  • Social media
  • Incel detection
  • Involuntary celibate