Ranking Pages by Topology and Popularity within Web Sites

Journal: World Wide Web

Abstract

We compare two link analysis methods for ranking the web pages within a site. The first, called Site Rank, is an adaptation of PageRank to the granularity of a single web site. The second, called Popularity Rank, is based on the frequencies of user clicks on the outlinks of a page, as captured by users' navigation sessions through the site. We ran experiments on artificially created web sites of different sizes and on two real data sets, using relative entropy to compare the distributions produced by the two ranking methods. For the real data sets we also use a nonparametric measure, Spearman's footrule, to compare the top-ten web pages ranked by the two methods. Our main result is that the distributions of Popularity Rank and Site Rank are surprisingly close to each other, implying that the topology of a web site is instrumental in guiding users through the site. Thus, in practice, Site Rank provides a reasonable first-order approximation of the aggregate behaviour of users within a web site, as given by Popularity Rank.
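The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: both ranks are taken as stationary distributions computed by power iteration with a damping factor of 0.85, Popularity Rank uses transition probabilities proportional to click counts, and the top-k footrule places a page missing from one list at position k + 1. The function names, the four-page site, and the click counts are hypothetical.

```python
import math


def stationary(weights, damping=0.85, iters=200):
    """Power iteration over a row-normalised weight matrix.

    weights[i][j] is 1 for a link i -> j (Site Rank, assumed form) or the
    number of observed clicks on that link (Popularity Rank, assumed form).
    """
    n = len(weights)
    rank = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n  # uniform teleportation mass
        for i, row in enumerate(weights):
            total = sum(row)
            for j in range(n):
                # A page with no outgoing weight jumps uniformly.
                p = row[j] / total if total > 0 else 1.0 / n
                nxt[j] += damping * rank[i] * p
        rank = nxt
    return rank


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p from q, in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def footrule_top_k(scores_a, scores_b, k):
    """Spearman's footrule between two top-k lists.

    Assumption: a page absent from one top-k list is given position k + 1.
    """
    def positions(scores):
        order = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
        return {page: pos for pos, page in enumerate(order, start=1)}

    pos_a, pos_b = positions(scores_a), positions(scores_b)
    pages = set(pos_a) | set(pos_b)
    return sum(abs(pos_a.get(p, k + 1) - pos_b.get(p, k + 1)) for p in pages)


# Hypothetical four-page site: link structure and click counts per link.
links = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0]]
clicks = [[0, 30, 10, 0], [5, 0, 15, 0], [20, 0, 0, 4], [6, 0, 0, 0]]

site_rank = stationary(links)
popularity_rank = stationary(clicks)

print("Site Rank:       ", [round(x, 3) for x in site_rank])
print("Popularity Rank: ", [round(x, 3) for x in popularity_rank])
print("Relative entropy:", relative_entropy(popularity_rank, site_rank))
print("Top-2 footrule:  ", footrule_top_k(site_rank, popularity_rank, 2))
```

A relative entropy close to zero indicates that the two distributions nearly coincide, which is the sense in which the abstract calls the two ranks "surprisingly close". The footrule compares only the ordering of the highest-ranked pages, so the two measures can disagree on how similar the rankings are.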

Author information

Corresponding author

Correspondence to Mark Levene.

About this article

Cite this article

Borges, J., Levene, M. Ranking Pages by Topology and Popularity within Web Sites. World Wide Web 9, 301–316 (2006). https://doi.org/10.1007/s11280-006-8558-y

  • DOI: https://doi.org/10.1007/s11280-006-8558-y

Keywords

Navigation