Web Spam Detection by Probability Mapping GraphSOMs and Graph Neural Networks

  • Lucia Di Noi
  • Markus Hagenbuchner
  • Franco Scarselli
  • Ah Chung Tsoi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6353)

Abstract

In this paper, we apply a combination of best-of-breed algorithms for processing graph-domain input data, namely probability mapping graph self-organizing maps and graph neural networks, to the task of detecting web spam. The two connectionist models are organized into a layered architecture that mixes unsupervised and supervised learning methods. The results of this layered architecture are comparable to the best results obtained so far by others using very different approaches.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Lucia Di Noi (1)
  • Markus Hagenbuchner (2)
  • Franco Scarselli (1)
  • Ah Chung Tsoi (3)
  1. University of Siena, Siena, Italy
  2. University of Wollongong, Wollongong, Australia
  3. Hong Kong Baptist University, Hong Kong