Stability and Similarity of Link Analysis Ranking Algorithms

  • Debora Donato
  • Stefano Leonardi
  • Panayiotis Tsaparas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3580)

Abstract

Recently, there has been a surge of research activity in the area of Link Analysis Ranking, where hyperlink structures are used to determine the relative authority of Web pages. One of the seminal works in this area is that of Kleinberg [15], who proposed the Hits algorithm. In this paper, we undertake a theoretical analysis of the properties of the Hits algorithm on a broad class of random graphs. Working within the framework of Borodin et al. [7], we prove that on this class (a) the Hits algorithm is stable with high probability, and (b) the Hits algorithm is similar to the InDegree heuristic, which assigns to each node a weight proportional to its number of incoming links. We show that our results hold when the expected in-degrees of the graph follow a power-law distribution, a situation observed in the actual Web graph [9]. We also study the similarity between Hits and InDegree experimentally, and we investigate the general conditions under which the two algorithms are similar.
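
As an illustration of the two algorithms compared in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard formulation of Hits authority weights as a power iteration on the link structure, and it uses the Euclidean distance between unit-normalized weight vectors as one simple closeness measure; the paper's formal notions of stability and similarity are defined in the framework of [7] and are not reproduced here. The function names and the four-node example graph are hypothetical.

```python
import math

def normalize(v):
    """Scale a weight vector to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def indegree_weights(adj):
    """InDegree: the weight of node j is proportional to its number of incoming links."""
    n = len(adj)
    return normalize([sum(adj[i][j] for i in range(n)) for j in range(n)])

def hits_authority_weights(adj, iterations=100):
    """Hits authority weights via repeated hub/authority updates (power iteration)."""
    n = len(adj)
    a = [1.0] * n
    for _ in range(iterations):
        # Hub weight of i: sum of authority weights of the nodes i links to.
        h = [sum(adj[i][j] * a[j] for j in range(n)) for i in range(n)]
        # Authority weight of j: sum of hub weights of the nodes linking to j.
        a = normalize([sum(adj[i][j] * h[i] for i in range(n)) for j in range(n)])
    return a

def distance(u, v):
    """Euclidean distance between two weight vectors (illustrative measure only)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Hypothetical example: adj[i][j] == 1 means node i links to node j.
adj = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
hits = hits_authority_weights(adj)
indeg = indegree_weights(adj)
print("Hits authority:", hits)
print("InDegree:      ", indeg)
print("distance:      ", distance(hits, indeg))
```

On this small graph both methods rank node 2 above node 1 and give no weight to the nodes without incoming links. This is only an illustration on one hand-picked graph and says nothing about the random-graph class analysed in the paper.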

References

  1. Achlioptas, D., McSherry, F.: Fast computation of low rank matrix approximations. In: ACM Symposium on Theory of Computing (STOC) (2001)
  2. Adamic, L.A., Huberman, B.A.: Zipf’s law and the internet. Glottometrics 3, 143–150 (2002)
  3. Aiello, W., Chung, F.R.K., Lu, L.: Random evolution in massive graphs. In: IEEE Symposium on Foundations of Computer Science, pp. 510–519 (2001)
  4. Azar, Y., Fiat, A., Karlin, A., McSherry, F., Saia, J.: Spectral analysis of data. In: Proceedings of the 33rd Symposium on Theory of Computing (STOC 2001), Greece (2001)
  5. Barabasi, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
  6. Bianchini, M., Gori, M., Scarselli, F.: Pagerank: A circuital analysis. In: Proceedings of the Eleventh International World Wide Web (WWW) Conference (2002)
  7. Borodin, A., Roberts, G.O., Rosenthal, J.S., Tsaparas, P.: Link Analysis Ranking: Algorithms, Theory, and Experiments. ACM Transactions on Internet Technology 5(1) (2005)
  8. Brin, S., Page, L.: The anatomy of a large-scale hypertextual Web search engine. In: Proceedings of the 7th International World Wide Web Conference, Brisbane, Australia (1998)
  9. Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., Wiener, J.: Graph structure in the Web. In: Proceedings of WWW9 (2000)
  10. Chung, F., Lu, L.: Connected components in random graphs with given degree sequences. Annals of Combinatorics 6, 125–145 (2002)
  11. Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Internet Mathematics 1, 91–114 (2003)
  12. Chung, F., Lu, L., Vu, V.: Eigenvalues of random power law graphs. Annals of Combinatorics 7, 21–33 (2003)
  13. Erdős, P., Rényi, A.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17–61 (1960)
  14. Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, SODA (2003)
  15. Kleinberg, J.: Authoritative sources in a hyperlinked environment. In: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 668–677 (1998)
  16. Kumar, R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tomkins, A., Upfal, E.: Stochastic models for the web graph. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science (2000)
  17. Lee, H.C., Borodin, A.: Perturbation of the hyperlinked environment. In: Proceedings of the Ninth International Computing and Combinatorics Conference (2003)
  18. Lempel, R., Moran, S.: The stochastic approach for link-structure analysis (SALSA) and the TKC effect. In: Proceedings of the 9th International World Wide Web Conference (2000)
  19. Lempel, R., Moran, S.: Rank stability and rank similarity of link-based web ranking algorithms in authority connected graphs. In: Second Workshop on Algorithms and Models for the Web-Graph, WAW 2003 (2003)
  20. Mihail, M., Papadimitriou, C.H.: On the eigenvalue power law. In: Rolim, J.D.P., Vadhan, S.P. (eds.) RANDOM 2002. LNCS, vol. 2483, p. 254. Springer, Heidelberg (2002)
  21. Motwani, R., Raghavan, P.: Randomized Algorithms. Cambridge University Press, Cambridge (1995)
  22. Ng, A.Y., Zheng, A.X., Jordan, M.I.: Link analysis, eigenvectors, and stability. In: Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI (2001)
  23. Stewart, G.W., Sun, J.: Matrix Perturbation Theory. Academic Press, London (1990)
  24. Zipf, G.K.: Human Behavior and the Principle of Least Effort. Addison-Wesley, Reading (1949)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Debora Donato (1)
  • Stefano Leonardi (1)
  • Panayiotis Tsaparas (2)

  1. Università di Roma “La Sapienza”
  2. University of Helsinki
