Facebook Inspector (FbI): Towards automatic real-time detection of malicious content on Facebook

  • Original Article
Social Network Analysis and Mining

Abstract

Online Social Networks witness a rise in user activity whenever a major event makes news. Cyber criminals exploit this surge in user engagement to spread malicious content that compromises system reputation, causes financial losses, and degrades user experience. In this paper, we collect and characterize a dataset of 4.4 million public posts generated on Facebook during 17 news-making events (natural calamities, sports, terror attacks, etc.) over a 16-month period. From this dataset, we extract two sets of malicious posts, one using URL blacklists and another using human annotations. Our observations reveal characteristic differences between the malicious posts obtained from the two methodologies, which calls for a twofold filtering process to build a more complete and robust filtering system. We empirically confirm the need for this twofold approach by cross-validating supervised learning models obtained from the two sets of malicious posts. These models include Naive Bayes, Decision Tree, Random Forest, and Support Vector Machine classifiers. Based on these findings, we implement Facebook Inspector, a REST API-based browser plug-in for identifying malicious Facebook posts in real time. Facebook Inspector uses class probabilities obtained from two independent supervised learning models, each based on a Random Forest classifier. Both models use a feature set of 44 publicly available features, and each achieves an accuracy of over 80%. During the first 9 months of its public deployment (August 2015–May 2016), Facebook Inspector processed 0.97 million posts at an average response time of 2.6 s per post and was downloaded over 2500 times. We also evaluate Facebook Inspector in terms of performance and usability to identify further scope for improvement.
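The abstract states that Facebook Inspector uses class probabilities from two independent models, but it does not say how the two probabilities are turned into a single verdict. The following is a minimal sketch of one possible rule, assuming a post is flagged when either model's malicious-class probability reaches a threshold. The `Verdict` type, the `combine` function, the OR rule, and the 0.5 default are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Outcome of the (assumed) twofold check for a single post."""
    p_blacklist: float  # P(malicious) from the model trained on blacklist-labelled posts
    p_annotated: float  # P(malicious) from the model trained on human-annotated posts
    malicious: bool


def combine(p_blacklist: float, p_annotated: float, threshold: float = 0.5) -> Verdict:
    """Flag a post if EITHER model's malicious-class probability reaches the threshold.

    The OR rule and the 0.5 default are illustrative assumptions, not the
    decision rule used by Facebook Inspector.
    """
    for p in (p_blacklist, p_annotated):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    flagged = max(p_blacklist, p_annotated) >= threshold
    return Verdict(p_blacklist, p_annotated, flagged)


if __name__ == "__main__":
    # Only the annotation-trained model is suspicious here; the post is still flagged.
    print(combine(0.15, 0.72))
    # Neither model is suspicious, so the post is not flagged.
    print(combine(0.10, 0.20))
```

An OR rule reflects the observation that the two sets of malicious posts differ in character, so either model alone may miss posts the other would catch. Other rules, such as averaging the two probabilities, would be equally plausible under the information given here.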

Notes

  1. http://edition.cnn.com/2014/07/14/tech/social-media/world-cup-social-media/.

  2. http://news.discovery.com/human/psychology/social-media-ebola-hoax-causes-deaths-14100.htm.

  3. http://multiosn.iiitd.edu.in/fbapi/endpoint/?version=2.0&fid=<post_id>.

  4. https://apps.facebook.com/mypagekeeper/.

  5. https://www.facebook.com/legal/terms.

  6. https://www.facebook.com/communitystandards.

  7. https://www.mturk.com/mturk/welcome.

  8. http://www.crowdflower.com/.

  9. http://docs.python-requests.org/en/latest/.

  10. http://longurl.org/api.

  11. We refer to a post as malicious if it contains a malicious URL.

  12. https://developers.facebook.com/docs/graph-api/reference/v2.2/url.

  13. The top 25 applications were used to generate over 95% of content in all three categories we analyzed.

  14. http://scikit-learn.org.

  15. http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html.

  16. Overall, only 24.74% of all posts in Dataset III originated from pages.

  17. We marked events as crisis based on the following definition of crisis from the Oxford English Dictionary: “A time of intense difficulty or danger”.

  18. http://webpy.org/.

  19. https://developers.facebook.com/docs/graph-api/reference/v2.5/post.

  20. As of May, 2016. Source: http://www.w3schools.com/browsers/browsers_stats.asp.

  21. http://precog.iiitd.edu.in/osm.html#fbi.

  22. https://chrome.google.com/webstore/detail/facebook-inspector/jlhjfkmldnokgkhbhgbnmiejokohmlfc.

  23. https://addons.mozilla.org/en-US/firefox/addon/fbi-facebook-inspector/.

  24. https://measuringu.com/sus.php.

  25. http://www.bbc.com/news/world-us-canada-21508660.

  26. http://newsroom.fb.com/news/2014/07/world-cup-breaks-facebook-records/.

  27. https://www.facebook.com/legal/terms.
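Note 3 gives the query format of the REST endpoint, with a `version` parameter and a post identifier passed as `fid`. As a minimal sketch, the query URL for a given post could be assembled as shown below. The function name `build_query_url` and the sample post ID are hypothetical, and the response format is not described in this section, so no request is sent.

```python
from urllib.parse import urlencode

# Endpoint and parameter names (version, fid) are taken from note 3.
ENDPOINT = "http://multiosn.iiitd.edu.in/fbapi/endpoint/"


def build_query_url(post_id: str, version: str = "2.0") -> str:
    """Return the endpoint URL for one Facebook post ID.

    urlencode percent-escapes any characters that are not URL-safe,
    so the post ID can be passed through unchanged.
    """
    if not post_id:
        raise ValueError("post_id must be non-empty")
    query = urlencode({"version": version, "fid": post_id})
    return f"{ENDPOINT}?{query}"


if __name__ == "__main__":
    # The post ID below is a made-up placeholder, not a real post.
    print(build_query_url("1234567890_987654321"))
```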

Acknowledgements

We would like to thank Manik Panwar for helping with the development of Facebook Inspector and Bhavna Nagpal for helping with conducting the usability survey. We would also like to thank the members of Precog Research Group at IIIT-Delhi for their constant support and feedback.

Author information

Corresponding author

Correspondence to Prateek Dewan.

About this article

Cite this article

Dewan, P., Kumaraguru, P. Facebook Inspector (FbI): Towards automatic real-time detection of malicious content on Facebook. Soc. Netw. Anal. Min. 7, 15 (2017). https://doi.org/10.1007/s13278-017-0434-5

  • DOI: https://doi.org/10.1007/s13278-017-0434-5
