KES 2005: Knowledge-Based Intelligent Information and Engineering Systems, pp. 723-729
Detecting Search Engine Spam from a Trackback Network in Blogspace
Conference paper
Abstract
We aim to develop a technique to detect search engine optimization (SEO) spam websites. Specifically, we propose four methods, each based on fundamental network metrics, for extracting SEO spam entries from a given trackback network in blogspace. Using real trackback network data from blogspace, we experimentally evaluate the performance of the proposed methods and demonstrate that ranking entries by the average degree of their nearest neighbors can be a very promising approach for extracting SEO spam entries.
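The ranking idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so several things here are assumptions rather than the authors' method: the trackback network is treated as an undirected graph, each entry is scored by the mean degree of its direct neighbors, and entries are listed from highest to lowest score. The function names `average_neighbor_degree` and `rank_entries` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def average_neighbor_degree(edges):
    """Map each entry to the mean degree of its direct neighbors.

    Assumption: trackback links are treated as undirected edges.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-links
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Degree of an entry = number of distinct entries it is linked with.
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}

    return {
        node: sum(degree[n] for n in nbrs) / len(nbrs)
        for node, nbrs in adjacency.items()
    }


def rank_entries(edges):
    """List entries by descending score; ties are broken by entry name.

    Assumption: the sort direction is illustrative only.
    """
    scores = average_neighbor_degree(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))


# Toy network (hypothetical data): entry "s" exchanges trackbacks with
# "a", "b", and "c"; entries "d" and "e" link only to each other.
toy_edges = [("s", "a"), ("s", "b"), ("s", "c"), ("d", "e")]
toy_scores = average_neighbor_degree(toy_edges)
toy_ranking = rank_entries(toy_edges)
```

In this toy network, "a", "b", and "c" each have a single neighbor of degree 3, so they score 3.0, while "s", "d", and "e" score 1.0. Whether high or low scores indicate spam depends on the data and is not stated in the abstract.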
Keywords
Search Engine; Complex Network Theory; International World Wide; 12th International World; Search Engine Optimization
Copyright information
© Springer-Verlag Berlin Heidelberg 2005