A Framework for Unsupervised Spam Detection in Social Networking Sites

Bosma, Maarten; Meij, Edgar; Weerkamp, Wouter

doi:10.1007/978-3-642-28997-2_31

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7224)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Social networking sites offer users the option to submit user spam reports for a given message, indicating this message is inappropriate. In this paper we present a framework that uses these user spam reports for spam detection. The framework is based on the HITS web link analysis framework and is instantiated in three models. The models successively introduce propagation between messages reported by the same user, messages authored by the same user, and messages with similar content. Each of the models can also be converted to a simple semi-supervised scheme. We test our models on data from a popular social network and compare the models to two baselines, based on message content and raw report counts. We find that our models outperform both baselines and that each of the additions (reporters, authors, and similar messages) further improves the performance of the framework.
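
The abstract does not give the model equations, so the following is only a minimal sketch of the general idea of HITS-style mutual reinforcement applied to spam reports, not the authors' method. It assumes a bipartite graph in which reporters play the hub role and messages play the authority role: a message scores higher when reliable reporters flag it, and a reporter scores higher when the messages they flag score high. The function name `hits_style_scores`, the `(reporter, message)` input format, and the fixed iteration count are all assumptions made for illustration. The propagation over authors and similar content described above is not modeled here.

```python
import math


def hits_style_scores(reports, iterations=50):
    """Illustrative HITS-style scoring on a reporter-message graph.

    `reports` is a collection of (reporter_id, message_id) pairs.
    Returns (message_scores, reporter_scores) as dictionaries.
    This is a hypothetical sketch, not the model from the paper.
    """
    # Deduplicate so a repeated report counts once; sort for determinism.
    edges = sorted(set(reports))
    reporters = {r for r, _ in edges}
    messages = {m for _, m in edges}

    # Start every node with the same score.
    hub = {r: 1.0 for r in reporters}    # reporter reliability (hub-like)
    auth = {m: 1.0 for m in messages}    # message spam score (authority-like)

    def normalize(scores):
        # Scale to unit Euclidean length, as in standard HITS.
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {k: v / norm for k, v in scores.items()}

    for _ in range(iterations):
        # Authority update: a message accumulates its reporters' scores.
        new_auth = {m: 0.0 for m in messages}
        for r, m in edges:
            new_auth[m] += hub[r]
        auth = normalize(new_auth)

        # Hub update: a reporter accumulates the scores of reported messages.
        new_hub = {r: 0.0 for r in reporters}
        for r, m in edges:
            new_hub[r] += auth[m]
        hub = normalize(new_hub)

    return auth, hub


if __name__ == "__main__":
    example = [("u1", "m1"), ("u2", "m1"), ("u3", "m1"), ("u3", "m2")]
    message_scores, reporter_scores = hits_style_scores(example)
    print(message_scores)
    print(reporter_scores)
```

In this toy input, message `m1` is reported by three users and `m2` by one, so `m1` ends up with the higher score. Because no labeled spam is needed, the scheme is unsupervised in the same sense as the framework in the abstract.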

Author information

Authors and Affiliations

ISLA, University of Amsterdam, Science Park 904, 1098 XH, Amsterdam, The Netherlands
Maarten Bosma, Edgar Meij & Wouter Weerkamp

Editor information

Editors and Affiliations

Yahoo! Research, Diagonal 177, 08018, Barcelona, Spain
Ricardo Baeza-Yates & B. Barla Cambazoglu
Centrum Wiskunde & Informatica, Science Park 123, Amsterdam, The Netherlands
Arjen P. de Vries
Websays, Nàpols 294 7-4, 08025, Barcelona, Spain
Hugo Zaragoza
Yahoo! Research, Diagonal 177, 08018, Barcelona, Spain
Vanessa Murdock
Yahoo! Labs, Tower 3, Matam Park, 31905, Haifa, Israel
Ronny Lempel
ISTI-CNR, via G. Moruzzi, 1, 56124, Pisa, Italy
Fabrizio Silvestri

About this paper

Cite this paper

Bosma, M., Meij, E., Weerkamp, W. (2012). A Framework for Unsupervised Spam Detection in Social Networking Sites. In: Baeza-Yates, R., et al. Advances in Information Retrieval. ECIR 2012. Lecture Notes in Computer Science, vol 7224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28997-2_31

DOI: https://doi.org/10.1007/978-3-642-28997-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28996-5
Online ISBN: 978-3-642-28997-2
eBook Packages: Computer Science, Computer Science (R0)
