Abstract
We propose a novel semi-supervised learning strategy for the Web spam problem. The proposed approach explores graph construction, which is the key to representing the semantic relationships among data, and emphasizes label propagation from multiple views under a consistency criterion. Furthermore, we infer labels for the remaining unlabeled nodes in a fused spectral space. Experiments on the Web Spam Challenge dataset validate the efficiency and effectiveness of the proposed method.
This work is partially supported by the Natural Science Foundation of China under grants No. 60275025 and No. 60121302.
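To make the idea of multi-view label propagation concrete, here is a minimal sketch and not the authors' algorithm. It assumes two hypothetical affinity matrices over the same hosts (`content_view` and `link_view`), fuses them by a plain average of their symmetrically normalized forms, and applies the closed-form propagation F = (I - alpha * S)^{-1} Y that is common in graph-based semi-supervised learning. The function names, the averaging rule, the toy graph, and `alpha = 0.9` are all illustrative assumptions; the spectral-space inference step mentioned in the abstract is not reproduced.

```python
import numpy as np

def normalize_affinity(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} of an affinity matrix."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]

def propagate_labels(views, Y, alpha=0.9):
    """Propagate seed labels over a graph fused from several views.

    views: list of symmetric, non-negative (n, n) affinity matrices.
    Y:     (n, n_classes) array, one-hot rows for labeled nodes, zeros elsewhere.
    """
    # Assumption: views are fused by averaging their normalized affinities.
    S = sum(normalize_affinity(W) for W in views) / len(views)
    n = S.shape[0]
    # Closed form F = (I - alpha * S)^{-1} Y, solved without an explicit inverse.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)

def symmetric_graph(n, edges):
    """Build a symmetric affinity matrix from (i, j, weight) triples."""
    W = np.zeros((n, n))
    for i, j, w in edges:
        W[i, j] = W[j, i] = w
    return W

# Toy data (hypothetical): hosts 0-2 and 3-5 form two groups, weakly connected.
content_view = symmetric_graph(6, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.1),
                                   (3, 4, 1.0), (4, 5, 1.0)])
link_view = symmetric_graph(6, [(0, 1, 1.0), (0, 2, 1.0),
                                (3, 5, 1.0), (4, 5, 1.0)])

# Seeds: host 0 is labeled normal (class 0), host 5 is labeled spam (class 1).
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

pred = propagate_labels([content_view, link_view], Y)
print(pred)
```

In this toy graph the only connection between the two groups is the weak content edge between hosts 2 and 3, so hosts 0 to 2 take the label of the normal seed and hosts 3 to 5 take the label of the spam seed.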
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, YJ., Yang, SH., Hu, BG. (2008). Fighting WebSpam: Detecting Spam on the Graph Via Content and Link Features. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_112
DOI: https://doi.org/10.1007/978-3-540-68125-0_112
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)