Spam Filtering in Twitter Using Sender-Receiver Relationship

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 6961)

Abstract

Twitter is one of the most visited sites today, and Twitter spam is constantly increasing. Because Twitter spam differs from traditional spam such as email and blog spam, conventional spam filtering methods are ill-suited to detecting it. Many researchers have therefore proposed schemes to detect spammers on Twitter. These schemes are based on features of spam accounts, such as content similarity, account age, and the ratio of URLs. However, using account features to detect spam has two significant problems. First, spammers can easily fabricate account features. Second, account features cannot be collected until spammers have carried out a number of malicious activities, which means that spammers are detected only after they have sent a number of spam messages. In this paper, we propose a novel spam filtering system that detects spam messages on Twitter. Instead of account features, we use relation features, such as the distance and connectivity between a message sender and a message receiver, to decide whether the current message is spam. Unlike account features, relation features are difficult for spammers to manipulate and can be collected immediately. We collected a large number of spam and non-spam Twitter messages, and then built and compared several classifiers. Our analysis shows that most spam comes from accounts that are only weakly related to the receiver. We also show that our scheme is better suited to detecting Twitter spam than previous schemes.
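
To make the two relation features named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy follow graph stored as a dictionary, in which an edge u -> v means "u follows v", and it measures both features from sender to receiver; the abstract does not specify the edge direction, so that choice is an assumption. Distance is taken as the shortest directed path length (breadth-first search), and connectivity is approximated by the number of edge-disjoint directed paths, computed as a unit-capacity maximum flow. The function names `distance` and `connectivity` and the sample accounts are hypothetical.

```python
from collections import deque


def distance(graph, sender, receiver):
    """Shortest directed path length from sender to receiver, or None if unreachable."""
    if sender == receiver:
        return 0
    seen = {sender}
    queue = deque([(sender, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == receiver:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def connectivity(graph, sender, receiver):
    """Number of edge-disjoint directed paths from sender to receiver.

    Computed as a maximum flow with unit capacities, using BFS augmenting paths.
    """
    if sender == receiver:
        return 0
    cap = {}  # residual capacity per directed edge
    adj = {}  # neighbours in the residual graph (both directions)
    for u, targets in graph.items():
        for v in targets:
            cap[(u, v)] = 1
            cap.setdefault((v, u), 0)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for a path with spare residual capacity.
        parent = {sender: None}
        queue = deque([sender])
        while queue and receiver not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if receiver not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = receiver
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


# Hypothetical toy data: the account "spammer" has no link to anyone.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
    "spammer": [],
}

if __name__ == "__main__":
    for sender in ("alice", "spammer"):
        print(sender, distance(follows, sender, "dave"), connectivity(follows, sender, "dave"))
```

In this toy graph, a sender with no path to the receiver gets no distance and zero connectivity, which is the kind of weak sender-receiver relation the abstract associates with spam. How such values are fed into a classifier is not covered by this sketch.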

Editor information

Robin Sommer, Davide Balzarotti, Gregor Maier

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Song, J., Lee, S., Kim, J. (2011). Spam Filtering in Twitter Using Sender-Receiver Relationship. In: Sommer, R., Balzarotti, D., Maier, G. (eds) Recent Advances in Intrusion Detection. RAID 2011. Lecture Notes in Computer Science, vol 6961. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23644-0_16

  • DOI: https://doi.org/10.1007/978-3-642-23644-0_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23643-3

  • Online ISBN: 978-3-642-23644-0

  • eBook Packages: Computer Science, Computer Science (R0)
