Spammer Detection on Online Social Networks Based on Logistic Regression

Zhu, Xiang; Nie, Yuanping; Jin, Songchang; Li, Aiping; Jia, Yan

doi:10.1007/978-3-319-23531-8_3


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9391)

Included in the following conference series:

International Conference on Web-Age Information Management


Abstract

Millions of users generate and propagate information in online social networks. Search engines and data mining tools allow people to track hot topics and events online. However, the massive use of social media also makes it easier for malicious users, known as social spammers, to flood social networks with junk information. To solve this problem, a classifier is needed to detect social spammers. One effective approach to spammer detection is based on content and user information. Nevertheless, social spammers are cunning and can fool the system by evolving their content and information. First, social spammers continually change their patterns to deceive detection systems. Second, spammers try to gain influence and disguise themselves as much as possible. Because of these dynamic patterns, it is difficult for existing methods to respond to social spammers effectively and efficiently. In this paper, we present a model based on logistic regression that considers the content attributes and behavior attributes of users in social networks. We analyze user attributes to identify those that inherently differentiate spammers from non-spammers. Experimental results on Twitter data show the effectiveness and efficiency of the proposed method.
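As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a logistic regression classifier over per-user attributes. It is not the authors' implementation. The three features (`url_ratio`, `duplicate_ratio`, `follow_ratio`), the toy data, and the training settings are hypothetical choices made only for this example; the abstract does not list the attributes actually used.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Estimated probability that the user with feature vector x is a spammer."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, label in zip(X, y):
            err = predict_proba(weights, bias, x) - label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical, made-up feature vectors, each scaled to [0, 1]:
#   url_ratio       (content):  share of posts containing a URL
#   duplicate_ratio (content):  share of near-duplicate posts
#   follow_ratio    (behavior): following / (following + followers)
X = [
    [0.9, 0.8, 0.95],
    [0.8, 0.9, 0.90],
    [0.7, 0.6, 0.85],
    [0.1, 0.2, 0.30],
    [0.2, 0.1, 0.40],
    [0.0, 0.3, 0.20],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = spammer, 0 = non-spammer

weights, bias = train(X, y)
spam_like = predict_proba(weights, bias, [0.85, 0.7, 0.9])
normal_like = predict_proba(weights, bias, [0.05, 0.1, 0.25])
```

Here `spam_like` and `normal_like` are the model's scores for two unseen users. A threshold such as 0.5 turns each score into a spammer or non-spammer label, and the sign of each learned weight indicates whether that attribute pushes a user toward the spammer class.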

X. Zhu: Sponsored by the National Key Fundamental Research and Development Program, No. 2013CB329601.



Author information

Authors and Affiliations

College of Computer, National University of Defense Technology, Changsha, China
Xiang Zhu, Yuanping Nie, Songchang Jin, Aiping Li & Yan Jia


Corresponding author

Correspondence to Xiang Zhu.

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
Xiaokui Xiao
Advanced Digital Sciences Center, Singapore, Singapore
Zhenjie Zhang


About this paper

Cite this paper

Zhu, X., Nie, Y., Jin, S., Li, A., Jia, Y. (2015). Spammer Detection on Online Social Networks Based on Logistic Regression. In: Xiao, X., Zhang, Z. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science, vol 9391. Springer, Cham. https://doi.org/10.1007/978-3-319-23531-8_3


DOI: https://doi.org/10.1007/978-3-319-23531-8_3
Published: 21 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23530-1
Online ISBN: 978-3-319-23531-8
eBook Packages: Computer Science, Computer Science (R0)
