Dynamic Feature Selection for Spam Detection in Twitter
Social networks continue to grow in popularity. With the widespread availability of Internet access, people's interest in social networks has also increased significantly. This popularity makes social media platforms a tempting target for misuse: malicious users attempt to gain unfair profit through fake accounts and various other techniques. Among these, spam is one of the most frequently used methods. Spam attacks on social networks are increasing, and many social network users are exposed to these and similar attacks. Identifying spam users among billions of social network users requires examining massive amounts of data, which is a challenging large-scale data analysis problem. In this study, we group similar Twitter users and introduce a dynamic feature selection technique that uses a different feature set for each user group instead of a single static feature set, and then apply machine learning algorithms to classify spam users on Twitter.
Keywords: Social Media, Spam Detection, Feature Selection, Big Data
This work is also a part of the M.Sc. thesis titled Big Data Analysis in Social Media at Istanbul University, Department of Computer Engineering.
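The pipeline outlined in the abstract (group similar users, select features per group, classify within each group) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the method used in the study: plain k-means stands in for the grouping step, features are ranked by the absolute difference between spam and non-spam means, a nearest-class-mean rule stands in for the machine learning classifier, and the three-column toy data and the names `fit`, `predict`, and `top_features` are hypothetical.

```python
import random
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(row, centroids):
    """Index of the centroid closest to row."""
    return min(range(len(centroids)), key=lambda c: sq_dist(row, centroids[c]))

def kmeans(rows, k, iters=20, seed=0):
    """Plain k-means (an assumed stand-in for the grouping step)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iters):
        labels = [nearest(r, centroids) for r in rows]
        for c in range(k):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return centroids, [nearest(r, centroids) for r in rows]

def top_features(rows, y, n):
    """Rank features by |mean(spam) - mean(non-spam)| and keep the best n."""
    spam = [r for r, t in zip(rows, y) if t == 1]
    ham = [r for r, t in zip(rows, y) if t == 0]
    if not spam or not ham:  # one class only: nothing to rank
        return list(range(n))
    scores = [abs(mean(s) - mean(h)) for s, h in zip(zip(*spam), zip(*ham))]
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:n])

def fit(rows, y, k, n_features):
    """Group users, then pick features and class means separately per group."""
    centroids, groups = kmeans(rows, k)
    model = {"centroids": centroids, "groups": {}}
    for c in range(k):
        members = [(r, t) for r, t, g in zip(rows, y, groups) if g == c]
        if not members:
            continue
        g_rows = [r for r, _ in members]
        g_y = [t for _, t in members]
        feats = top_features(g_rows, g_y, n_features)
        class_means = {}
        for label in set(g_y):
            sel = [[r[j] for j in feats] for r, t in members if t == label]
            class_means[label] = [mean(col) for col in zip(*sel)]
        model["groups"][c] = (feats, class_means)
    return model

def predict(model, row):
    """Route the user to a group, then classify on that group's features only."""
    c = nearest(row, model["centroids"])
    feats, class_means = model["groups"][c]
    x = [row[j] for j in feats]
    return min(class_means, key=lambda label: sq_dist(x, class_means[label]))

# Hypothetical toy data: column 0 separates two user groups; spam (label 1)
# is visible in column 1 for the first group and in column 2 for the second.
rows = [
    [0.0, 0.0, 0.5], [1.0, 0.1, 0.4], [0.0, 1.0, 0.5], [1.0, 0.9, 0.4],
    [100.0, 0.5, 0.0], [101.0, 0.4, 0.1], [100.0, 0.5, 1.0], [101.0, 0.4, 0.9],
]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

model = fit(rows, labels, k=2, n_features=1)
for group_id, (feats, _) in model["groups"].items():
    print("group", group_id, "uses feature columns", feats)
print("prediction:", predict(model, [0.5, 0.95, 0.45]))
```

The point of the sketch is the contrast with a static feature set: each group keeps its own list of feature columns, so a new user is scored only on the features that were informative for similar users. Real features would also need scaling before clustering, which this sketch omits.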